Bucket accessible from Superset to host data. #123

Open
2 tasks
martinpovolny opened this issue Feb 11, 2021 · 3 comments


martinpovolny commented Feb 11, 2021

As a Data Scientist,
I want read/write access to a bucket that is connected to Superset
so that I can visualize the data in that bucket in a Superset dashboard.

Acceptance Criteria

  • the bucket is available on op1st/moc
  • access credential is documented as a secret in some ops repo
@martinpovolny

/assign @martinpovolny

@martinpovolny changed the title from "Aa a Data Scientists," to "Bucket accessible from Superset to host data." on Feb 11, 2021

martinpovolny commented Feb 25, 2021

Status update

We already have a bucket for the project, created in #111 and #115.

Unfortunately, if we are also going to use it with workflows, it needs a stable name, so it is being renamed in #131.

Bucket access

This bucket is accessible using credentials stored in a configmap and a secret, both named the same as the bucket claim and located in the same project (this app's project). This works (tested).

Permissions are set for the DS group so workflows and people can use these to access the bucket.
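
A minimal sketch of how a workflow might turn those credentials into S3 client settings. It assumes the configmap and secret use the key names common to ObjectBucketClaim-style bucket claims (`BUCKET_HOST`, `BUCKET_PORT`, `BUCKET_NAME`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`) and that they are exposed to the pod as environment variables. The function name `bucket_settings` and the `scheme` parameter are hypothetical; check the actual configmap and secret for the real keys.

```python
import os
from typing import Mapping, Tuple


def bucket_settings(
    env: Mapping[str, str] = os.environ,
    scheme: str = "https",  # ASSUMPTION: the endpoint is served over HTTPS
) -> Tuple[str, dict]:
    """Return (bucket_name, s3_client_kwargs) built from bucket-claim settings.

    ASSUMPTION: the key names below follow the ObjectBucketClaim convention;
    they are not confirmed by this issue.
    """
    host = env["BUCKET_HOST"]          # from the configmap
    port = env.get("BUCKET_PORT", "")  # from the configmap, may be absent
    endpoint = f"{scheme}://{host}:{port}" if port else f"{scheme}://{host}"
    kwargs = {
        "endpoint_url": endpoint,
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],          # from the secret
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],  # from the secret
    }
    return env["BUCKET_NAME"], kwargs
```

If `boto3` is installed, the result can be passed on as `boto3.client("s3", **kwargs)`, and the returned bucket name used for calls such as `list_objects_v2(Bucket=bucket)`.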

I have not tested whether we can access the bucket from Superset. There might be a different bucket that is pre-configured in Superset and Hue; we had a discussion on this with @tumido: operate-first/support#23 (comment)

Documentation

Here's a documentation issue for bucket use: operate-first/support#48
Here are the steps needed to create a bucket for a project: operate-first/support#48 (comment)

TODO: Do we have any documentation for accessing the buckets from Superset and Hue? (It should be on the Operate First site.)

Documentation on accessing Superset and Hue (passwords) is here:
https://www.operate-first.cloud/users/support/

  • How to configure bucket access for Superset?
  • How to configure bucket access for Hue?

Other related information:

There's also an S3 interface provided by MOC, mentioned here: open-infrastructure-labs/ops-issues#33


hemajv commented Mar 18, 2021

In order to create dashboards in Superset, the workflow we have followed in the past is:

store data in Ceph bucket -> create table in Hue for this data -> use the table in Superset to create dashboards

So we would also require Hue to have access to the bucket, i.e. the S3 connection needs to be set up in Hue so that we can create tables for the data stored in the bucket. Currently, however, there seem to be some issues that prevent us from creating the tables in Hue; see operate-first/support#131
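
For illustration, the "create table in Hue" step of the workflow above could amount to running a statement like the one this sketch generates. It assumes Hue's editor is backed by a Hive-compatible SQL engine, that the data is stored as Parquet, and that the bucket is reachable through an `s3a://` location. None of this is confirmed in this issue, and the helper name, table name, and columns are hypothetical.

```python
from typing import Iterable, Tuple


def external_table_ddl(
    table: str,
    columns: Iterable[Tuple[str, str]],
    bucket: str,
    prefix: str,
) -> str:
    """Build a Hive-style CREATE EXTERNAL TABLE statement for data in a bucket.

    ASSUMPTIONS: Hive-compatible SQL, Parquet files, s3a:// URI scheme.
    """
    # One "`name` TYPE" entry per column, each on its own line.
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    location = f"s3a://{bucket}/{prefix.strip('/')}/"
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n  {cols}\n)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical table and bucket names, for demonstration only.
    print(external_table_ddl(
        "metrics",
        [("ts", "TIMESTAMP"), ("value", "DOUBLE")],
        bucket="demo-bucket",
        prefix="/metrics/",
    ))
```

Once such a table exists, Superset would query it through whatever database connection it shares with Hue, which is why Hue needs working access to the bucket first.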
