Organize Enterprise docs (#7702)
* Organize Enterprise docs

* PR fixes

* Fix Enterprise docs

* idp
itaiad200 authored May 9, 2024
1 parent f03e8b6 commit 6065e78
Showing 11 changed files with 2,041 additions and 32 deletions.
1,450 changes: 1,450 additions & 0 deletions docs/assets/img/enterprise/enterprise-arch.excalidraw

Large diffs are not rendered by default.

Binary file added docs/assets/img/enterprise/enterprise-arch.png
13 changes: 2 additions & 11 deletions docs/reference/security/external-principals-aws.md
@@ -57,17 +57,8 @@ It's also important to note that Amazon does NOT appear to include any sort of a
* In lakeFS, `auth.authentication_api.external_principals_enabled` must be set to `true` in the configuration file. Other configuration options (`auth.authentication_api.*`) can be found in the [configuration reference]({% link reference/configuration.md %}).

**fluffy server configuration reference:**

* `auth.external.aws_auth.enabled` `(bool : false)` - If true, external principals API will be enabled, e.g auth service and login api's.
* `auth.external.aws_auth.get_caller_identity_max_age` `(duration : 15m)` - The maximum age in seconds for the GetCallerIdentity request to be valid, the max is 15 minutes enforced by AWS, smaller TTL can be set.
* `auth.authentication_api.external_principals_enabled` `(bool : false)` - If true, external principals API will be enabled, e.g auth service and login api's.
* `auth.external.aws_auth.valid_sts_hosts` `([]string)` - The default are all the valid AWS STS hosts (`sts.amazonaws.com`, `sts.us-east-2.amazonaws.com` etc).
* `auth.external.aws_auth.required_headers` `(map[string]string : )` - Headers that must be present by the client when doing login request (e.g `X-LakeFS-Server-ID: <lakefs.ingress.domain>`).
* `auth.external.aws_auth.optional_headers` `(map[string]string : )` - Optional headers that can be present by the client when doing login request.
* `auth.external.aws_auth.http_client.timeout` `(duration : 10s)` - The timeout for the HTTP client used to communicate with AWS STS.
* `auth.external.aws_auth.http_client.skip_verify` `(bool : false)` - Skip SSL verification with AWS STS.

For the full list of Fluffy server configuration options, see the `auth.external.aws_auth` section of [Fluffy Configuration]({% link understand/enterprise/fluffy-configuration.md %}).


{: .note}
> By default lakeFS clients will add the parameter `X-LakeFS-Server-ID: <lakefs.ingress.domain>` to the initial [login request][login-api] for STS.
2 changes: 1 addition & 1 deletion docs/reference/security/rbac.md
@@ -19,7 +19,7 @@ lakeFS Enterprise


{: .note}
> RBAC is available on [lakeFS Cloud]({% link understand/lakefs-cloud.md %}) and [lakeFS Enterprise]({% link understand/lakefs-enterprise.md %}).
> RBAC is available on [lakeFS Cloud]({% link understand/lakefs-cloud.md %}) and [lakeFS Enterprise]({% link understand/enterprise/index.md %}).
>
> If you're using the open source version of lakeFS then the [ACL-based authorization mechanism](access-control-lists.html) is an alternative to RBAC.
4 changes: 2 additions & 2 deletions docs/reference/security/sso.md
@@ -340,8 +340,8 @@ Notes:
* Check the [examples on GitHub](https://github.com/treeverse/charts/tree/master/examples/lakefs/enterprise) we provide for each authentication method (oidc/adfs/ldap + rbac).
* The examples are provisioned with a Postgres pod for quick-start; make sure to replace it with a stable database once ready.
* The encrypt secret key `secrets.authEncryptSecretKey` is shared between fluffy and lakeFS for authentication.
* The lakeFS `image.tag` must be >= 0.100.0
* The fluffy `image.tag` must be >= 0.2.0
* The lakeFS `image.tag` must be >= 1.0.0
* The fluffy `image.tag` must be >= 0.2.7
* Change the `ingress.hosts[0]` from `lakefs.company.com` to a real host (usually same as lakeFS), also update additional references in the file (note: URL path after host if provided should stay unchanged).
* Update the `ingress` configuration with other optional fields if used
* Fluffy docker image: replace the `fluffy.image.privateRegistry.secretToken` with real token to dockerhub for the fluffy docker image.
38 changes: 38 additions & 0 deletions docs/understand/enterprise/architecture.md
@@ -0,0 +1,38 @@
---
title: Enterprise Architecture
description: Understand lakeFS Enterprise Architecture
parent: lakeFS Enterprise
grand_parent: Understanding lakeFS
---

# Architecture

![img.png](../../assets/img/enterprise/enterprise-arch.png)

[1] Any user request to lakeFS via Browser or Programmatic access (SDK, HTTP
API, lakectl).

[2] Reverse Proxy (e.g. NGINX, Traefik, K8S Ingress) - handles user requests
and routes them to either the lakeFS server or the fluffy server based on the
path prefix, while keeping the same host.

[3] lakeFS server - the main lakeFS service.

[4] fluffy server - the service responsible for the Enterprise features.
Its functions are separated by port for security reasons:

1. SSO auth (e.g. Browser login via Azure AD, Okta, Auth0), default port 8000.
1. RBAC authorization, default port 9000.

[5] The [KV Store]({% link understand/architecture.md %}) - where metadata is stored. It is used by both lakeFS and fluffy.

[6] SSO IdP - Identity provider (e.g. Azure AD, Okta, JumpCloud). fluffy
implements the SAML and OAuth2 protocols.
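
The path-prefix routing done by the reverse proxy in [2] can be sketched as below. This is only an illustration of the idea, not the actual proxy configuration: the prefixes in `FLUFFY_PREFIXES` and both upstream addresses are hypothetical placeholders, and a real deployment would express the same rule in NGINX, Traefik or Ingress configuration.

```python
from urllib.parse import urlsplit

# Hypothetical values, chosen only for illustration. The real prefixes and
# addresses depend on your deployment and are not specified here.
FLUFFY_PREFIXES = ("/oidc/", "/saml/", "/sso/")
FLUFFY_UPSTREAM = "http://fluffy-server:8000"
LAKEFS_UPSTREAM = "http://lakefs-server:8000"


def pick_upstream(request_url: str) -> str:
    """Return the upstream that should serve the request, based on its path prefix."""
    path = urlsplit(request_url).path
    if any(path.startswith(prefix) for prefix in FLUFFY_PREFIXES):
        return FLUFFY_UPSTREAM
    # Everything else goes to the main lakeFS service.
    return LAKEFS_UPSTREAM
```

Both upstreams are reached through the same public host; only the path decides which server handles the request.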


For more details and pricing, please [contact sales](https://lakefs.io/contact-sales/).


**Note:** Setting up lakeFS enterprise with an SSO IdP (OIDC, SAML or LDAP) requires
configuring access from the IdP too.
{: .note }
128 changes: 128 additions & 0 deletions docs/understand/enterprise/fluffy-configuration.md
@@ -0,0 +1,128 @@
---
title: Fluffy Server Configuration
description: Configuration reference for Fluffy Server
parent: lakeFS Enterprise
grand_parent: Understanding lakeFS
redirect_from:
- /understand/fluffy-configuration.html
---

# Fluffy Server Configuration

{% include toc.html %}

Fluffy is configured using a YAML configuration file and/or environment variables.
The configuration file's location can be set with the `--config` flag. If not specified, the first file found in the following order will be used:
1. ./config.yaml
1. `$HOME`/fluffy/config.yaml
1. /etc/fluffy/config.yaml
1. `$HOME`/.fluffy.yaml
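
The lookup order above can be sketched in a few lines. This is an illustrative model of the documented behaviour, not Fluffy's actual implementation; the function names and the injectable `exists` predicate are assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, List, Optional


def candidate_config_paths(home: Path) -> List[Path]:
    """Default search locations, in the documented order."""
    return [
        Path("config.yaml"),               # ./config.yaml
        home / "fluffy" / "config.yaml",   # $HOME/fluffy/config.yaml
        Path("/etc/fluffy/config.yaml"),
        home / ".fluffy.yaml",             # $HOME/.fluffy.yaml
    ]


def find_config(
    explicit: Optional[str] = None,
    home: Optional[Path] = None,
    exists: Callable[[Path], bool] = Path.is_file,
) -> Optional[Path]:
    """Return the explicit --config path if given, else the first existing default."""
    if explicit:
        return Path(explicit)
    for candidate in candidate_config_paths(home or Path.home()):
        if exists(candidate):
            return candidate
    return None
```

Passing `exists` as a parameter keeps the sketch testable without touching the real filesystem.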

Configuration items can be controlled by environment variables, see [below](#using-environment-variables).


## Reference

This reference uses `.` to denote the nesting of values.
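
To make the dotted notation concrete, the sketch below expands dotted keys into the nested structure a YAML file would contain. The helper `nest` is hypothetical and exists only for this illustration; the keys and values are taken from the reference that follows.

```python
from typing import Any, Dict


def nest(dotted: Dict[str, Any]) -> Dict[str, Any]:
    """Expand keys such as 'logging.format' into nested dictionaries."""
    root: Dict[str, Any] = {}
    for key, value in dotted.items():
        *parents, leaf = key.split(".")
        node = root
        for part in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


nested = nest({
    "logging.format": "json",
    "logging.level": "INFO",
    "database.postgres.max_open_connections": 25,
})
```

Here `nested["logging"]` holds both logging keys, which mirrors a `logging:` section with `format` and `level` entries in the YAML file.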

* `logging.format` `(one of ["json", "text"] : "text")` - Format to output log message in
* `logging.level` `(one of ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "INFO")` - Logging level to output
* `logging.audit_log_level` `(one of ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "DEBUG")` - Audit logs level to output.

**Note:** If this level is set lower than the main `logging.level`, the audit logs will not be emitted.
{: .note }
* `logging.output` `(string : "-")` - A path or paths to write logs to. A `-` means the standard output, `=` means the standard error.
* `logging.file_max_size_mb` `(int : 100)` - Output file maximum size in megabytes.
* `logging.files_keep` `(int : 0)` - Number of log files to keep, default is all.
* `logging.trace_request_headers` `(bool : false)` - If set to `true` and logging level is set to `TRACE`, logs request headers.
* `listen_address` `(string : "0.0.0.0:8000")` - A `<host>:<port>` structured string representing the address to listen on
* `database` - Configuration section for the Fluffy key-value store database. The database must be shared between lakeFS & Fluffy
+ `database.type` `(string ["postgres"|"dynamodb"|"local"] : )` - Fluffy database type
+ `database.postgres` - Configuration section when using `database.type="postgres"`
+ `database.postgres.connection_string` `(string : "postgres://localhost:5432/postgres?sslmode=disable")` - PostgreSQL connection string to use
+ `database.postgres.max_open_connections` `(int : 25)` - Maximum number of open connections to the database
+ `database.postgres.max_idle_connections` `(int : 25)` - Maximum number of connections in the idle connection pool
+ `database.postgres.connection_max_lifetime` `(duration : 5m)` - Sets the maximum amount of time a connection may be reused `(valid units: ns|us|ms|s|m|h)`
+ `database.dynamodb` - Configuration section when using `database.type="dynamodb"`
+ `database.dynamodb.table_name` `(string : "kvstore")` - Table used to store the data
+ `database.dynamodb.scan_limit` `(int : 1025)` - Maximal number of items per page during scan operation

**Note:** Refer to the following [AWS documentation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html#Query.Limit) for further information
{: .note }
+ `database.dynamodb.endpoint` `(string : )` - Endpoint URL for database instance
+ `database.dynamodb.aws_region` `(string : )` - AWS Region of database instance
+ `database.dynamodb.aws_profile` `(string : )` - AWS named profile to use
+ `database.dynamodb.aws_access_key_id` `(string : )` - AWS access key ID
+ `database.dynamodb.aws_secret_access_key` `(string : )` - AWS secret access key
+ **Note:** `endpoint`, `aws_region`, `aws_access_key_id` and `aws_secret_access_key` are not required and are used mainly for experimental purposes when working with DynamoDB with different AWS credentials.
{: .note }
+ `database.dynamodb.health_check_interval` `(duration : 0s)` - Interval to run health check for the DynamoDB instance (won't run if equal to 0).
+ `database.local` - Configuration section when using `database.type="local"`
+ `database.local.path` `(string : "~/fluffy/metadata")` - Local path on the filesystem to store embedded KV metadata
+ `database.local.sync_writes` `(bool: true)` - Ensure each write is written to the disk. Disable to increase performance
+ `database.local.prefetch_size` `(int: 256)` - How many items to prefetch when iterating over embedded KV records
+ `database.local.enable_logging` `(bool: false)` - Enable trace logging for local driver
* `auth` - Configuration section for the Fluffy authentication services, like SAML or OIDC.
+ `auth.encrypt.secret_key` `(string : required)` - Same value given to lakeFS. A random (cryptographically safe) generated string that is used for encryption and HMAC signing
+ `auth.logout_redirect_url` `(string : "/auth/login")` - The address to redirect to after a successful logout, e.g. login.
+ `auth.post_login_redirect_url` `(string : '')` - Required when SAML is enabled. The address to redirect after a successful login. For most common configurations, setting to `/` will redirect to lakeFS homepage.
+ `auth.serve_listen_address` `(string : '')` - If set, an endpoint serving RBAC requests binds to this address.
+ `auth.serve_disable_authentication` `(bool : false)` - Unsafe. Disables authentication to the RBAC server.
+ `auth.ldap`
+ `auth.ldap.server_endpoint` `(string : required)` - The LDAP server address, e.g. 'ldaps://ldap.company.com:636'
+ `auth.ldap.bind_dn` `(string : required)` - The bind string, e.g. 'uid=<bind-user-name>,ou=Users,o=<org-id>,dc=<company>,dc=com'
+ `auth.ldap.bind_password` `(string : required)` - The password for the user to bind.
+ `auth.ldap.username_attribute` `(string : required)` - The user name attribute, e.g. 'uid'
+ `auth.ldap.user_base_dn` `(string : required)` - The search request base dn, e.g. 'ou=Users,o=<org-id>,dc=<company>,dc=com'
+ `auth.ldap.user_filter` `(string : required)` - The search request user filter, e.g. '(objectClass=inetOrgPerson)'
+ `auth.ldap.connection_timeout_seconds` `(int : required)` - The timeout for a single connection
+ `auth.ldap.request_timeout_seconds` `(int : required)` - The timeout for a single request
+ `auth.saml` Configuration section for SAML
+ `auth.saml.enabled` `(bool : false)` - Enables SAML Authentication.
+ `auth.saml.sp_root_url` `(string : '')` - The base lakeFS-URL, e.g. 'https://<lakefs-url>'
+ `auth.saml.sp_x509_key_path` `(string : '')` - The path to the private key, e.g. '/etc/saml_certs/rsa_saml_private.cert'
+ `auth.saml.sp_x509_cert_path` `(string : '')` - The path to the public key, e.g. '/etc/saml_certs/rsa_saml_public.pem'
+ `auth.saml.sp_sign_request` `(bool : 'false')` - Whether to sign requests; some IdPs require the SLO request to be signed
+ `auth.saml.sp_signature_method` `(string : '')` - Optional signature method; valid values depend on the IdP configuration, e.g. 'http://www.w3.org/2001/04/xmldsig-more#rsa-sha256'
+ `auth.saml.idp_metadata_url` `(string : '')` - The URL for the metadata server, e.g. 'https://<adfs-auth.company.com>/federationmetadata/2007-06/federationmetadata.xml'
+ `auth.saml.idp_skip_verify_tls_cert` `(bool : false)` - Insecure skip verification of the IdP TLS certificate, like when signed by a private CA
+ `auth.saml.idp_authn_name_id_format` `(string : 'urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified')` - The format used in the NameIDPolicy for authentication requests
+ `auth.saml.idp_request_timeout` `(duration : '10s')` - The timeout for remote authentication requests.
+ `auth.saml.external_user_id_claim_name` `(string : '')` - The claim name to use as the user identifier with an IdP mostly for logout
+ `auth.oidc` Configuration section for OIDC
+ `auth.oidc.enabled` `(bool : false)` - Enables OIDC Authentication.
+ `auth.oidc.url` `(string : '')` - The OIDC provider url, e.g. 'https://oidc-provider-url.com/'
+ `auth.oidc.client_id` `(string : '')` - The application's ID.
+ `auth.oidc.client_secret` `(string : '')` - The application's secret.
+ `auth.oidc.callback_base_url` `(string : '')` - A default callback address of the Fluffy server.
+ `auth.oidc.callback_base_urls` `(string[] : '[]')`
+ **Note:** You may configure a list of URLs that the OIDC provider may redirect to. This allows lakeFS to be accessed from multiple hostnames while retaining federated auth capabilities.
If the provider redirects to a URL not in this list, the login will fail. This property and callback_base_url are mutually exclusive.
{: .note }
+ `auth.oidc.authorize_endpoint_query_parameters` `(map[string]string : )` - Key/value parameters that are passed to a provider's authorization endpoint.
+ `auth.oidc.logout_endpoint_query_parameters` `(string[] : '[]')` - The query parameters that will be used to redirect the user to the OIDC provider after logout, e.g. '[returnTo, https://<lakefs.ingress.domain>/oidc/login]'
+ `auth.oidc.logout_client_id_query_parameter` `(string : '')` - The claim name that represents the client identifier in the OIDC provider
+ `auth.oidc.additional_scope_claims` `(string[] : '[]')` - Specifies optional requested permissions, other than `openid` and `profile` that are being used.
+ `auth.cache` Configuration section for RBAC service cache
+ `auth.cache.enabled` `(bool : true)` - Enables RBAC service cache
+ `auth.cache.size` `(int : 1024)` - Number of users, policies and credentials to cache.
+ `auth.cache.ttl` `(duration : 20s)` - Cache items time to live expiry.
+ `auth.cache.jitter` `(duration : 3s)` - Cache items time to live jitter.
+ `auth.external` - Configuration section for the external authentication methods
+ `auth.external.aws_auth` - Configuration section for authenticating to lakeFS using AWS presign get-caller-identity request: [External Principals AWS Auth]({% link reference/security/external-principals-aws.md %})
+ `auth.external.aws_auth.enabled` `(bool : false)` - If true, the external principals API will be enabled, e.g. the auth service and login APIs.
+ `auth.external.aws_auth.get_caller_identity_max_age` `(duration : 15m)` - The maximum age of a GetCallerIdentity request for it to be considered valid. AWS enforces a maximum of 15 minutes; a smaller TTL can be set.
+ `auth.authentication_api.external_principals_enabled` `(bool : false)` - If true, the external principals API will be enabled, e.g. the auth service and login APIs.
+ `auth.external.aws_auth.valid_sts_hosts` `([]string)` - Defaults to all valid AWS STS hosts (`sts.amazonaws.com`, `sts.us-east-2.amazonaws.com`, etc.).
+ `auth.external.aws_auth.required_headers` `(map[string]string : )` - Headers that the client must send with the login request (e.g. `X-LakeFS-Server-ID: <lakefs.ingress.domain>`).
+ `auth.external.aws_auth.optional_headers` `(map[string]string : )` - Optional headers that the client may send with the login request.
+ `auth.external.aws_auth.http_client.timeout` `(duration : 10s)` - The timeout for the HTTP client used to communicate with AWS STS.
+ `auth.external.aws_auth.http_client.skip_verify` `(bool : false)` - Skip SSL verification with AWS STS.
{: .ref-list }
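
The relationship between `auth.oidc.callback_base_url` and `auth.oidc.callback_base_urls` described in the note above can be modelled roughly as follows. This is a sketch under assumptions, not Fluffy's code: both function names are hypothetical, and comparing redirects by scheme and host is one plausible matching rule chosen for illustration, since the reference does not say how URLs are compared.

```python
from typing import List, Optional
from urllib.parse import urlsplit


def effective_callback_urls(
    callback_base_url: str = "",
    callback_base_urls: Optional[List[str]] = None,
) -> List[str]:
    """Merge the two settings, rejecting configurations that set both."""
    urls = list(callback_base_urls or [])
    if callback_base_url and urls:
        raise ValueError(
            "auth.oidc.callback_base_url and auth.oidc.callback_base_urls "
            "are mutually exclusive"
        )
    if urls:
        return urls
    return [callback_base_url] if callback_base_url else []


def redirect_allowed(redirect_url: str, allowed: List[str]) -> bool:
    """Assumed rule: a redirect is allowed if its scheme and host match an entry."""
    def origin(url: str):
        parts = urlsplit(url)
        return (parts.scheme, parts.netloc)

    return origin(redirect_url) in {origin(u) for u in allowed}
```

A redirect to a host that is not listed fails the check, which corresponds to the login failure the note describes.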

## Using Environment Variables

All the configuration variables can be set or overridden using environment variables.
To set an environment variable, prepend `FLUFFY_` to its name, convert it to upper case, and replace `.` with `_`:

For example, `logging.format` becomes `FLUFFY_LOGGING_FORMAT`, `auth.saml.enabled` becomes `FLUFFY_AUTH_SAML_ENABLED`, etc.
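
The naming rule is mechanical, so it can be expressed as a one-line helper. The function name `env_var_name` is made up for this illustration; the rule itself is the one stated above.

```python
def env_var_name(key: str, prefix: str = "FLUFFY_") -> str:
    """Convert a dotted configuration key to its environment variable name."""
    return prefix + key.upper().replace(".", "_")


print(env_var_name("logging.format"))     # FLUFFY_LOGGING_FORMAT
print(env_var_name("auth.saml.enabled"))  # FLUFFY_AUTH_SAML_ENABLED
```

Keys that already contain underscores keep them, so `auth.encrypt.secret_key` maps to `FLUFFY_AUTH_ENCRYPT_SECRET_KEY`.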
39 changes: 39 additions & 0 deletions docs/understand/enterprise/index.md
@@ -0,0 +1,39 @@
---
title: lakeFS Enterprise
description: lakeFS Enterprise is an enterprise-ready lakeFS solution providing additional features including RBAC, SSO and Support SLA.
has_children: true
has_toc: false
parent: Understanding lakeFS
redirect_from:
- /understand/lakefs-enterprise.html
- /enterprise/index.html
---

# lakeFS Enterprise

lakeFS Enterprise is a commercially supported version of the open-source lakeFS project,
offering additional features and functionality aimed at businesses.
It provides several benefits over the open-source version:

* Security - Advanced Authorization
* [RBAC]({% link reference/security/rbac.md %}) - implements role-based access control to manage user permissions. It allows for fine-grained control by associating permissions with users and groups, granting them specific actions on specific resources. This ensures data security and compliance within an organization.
* Security - Advanced Authentication
* [SSO]({% link reference/security/sso.md %}) - lets users sign in with existing credentials from a trusted provider, eliminating separate logins.
* [STS Auth]({% link reference/security/sts-login.md %}) - offers temporary, secure logins using an Identity Provider, simplifying user access and enhancing security.
* [Authenticate to lakeFS with AWS IAM Roles]({% link reference/security/external-principals-aws.md %}) - lets programs authenticate using AWS IAM roles instead of lakeFS credentials, granting access based on IAM policies.
* Support SLA

[Contact Sales](https://lakefs.io/contact-sales/) to get the token for Fluffy.
{: .note}

## Overview

The lakeFS Enterprise solution consists of two main components:
1. lakeFS - Open Source: [treeverse/lakeFS](https://hub.docker.com/r/treeverse/lakefs),
release info found in [GitHub releases](https://github.com/treeverse/lakeFS/releases).
2. Fluffy - Proprietary: In charge of the Enterprise features. Can be retrieved from
[Treeverse Dockerhub](https://hub.docker.com/u/treeverse) using the granted token.

You can learn more about [lakeFS Enterprise architecture]({% link understand/enterprise/architecture.md %}), or
follow the examples in the [quickstart guide]({% link understand/enterprise/orchestration.md %}).
