This is a Singer tap that produces JSON-formatted data following the Singer spec.
This tap:
- Pulls raw data from Mixpanel's REST API
- Extracts the following resources from Mixpanel:
- Outputs the schema for each resource
- Incrementally pulls data based on the input state and config
- Install
> pip install tap-mixpanel
- Get your Mixpanel API Secret
Log in to your Mixpanel account and click your name in the top right. Click Project settings. You'll see your API secret there.
- Create the config file
Create a JSON file called config.json containing the API secret and the names of the events you want to pull, in the following format (end-date is optional):

    {
      "api-secret": "your-api-secret",
      "raw-data": "true",
      "events": "true",
      "event-names": ["event1", "event2"],
      "start-date": "2017-05-10T00:00:00Z",
      "end-date": "2017-05-11T00:00:00Z"
    }

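Before running the tap, a config of this shape can be sanity-checked against the required/optional rules listed for each key in this section. The following is a minimal sketch, not part of tap-mixpanel: `check_config` and `REQUIRED_KEYS` are hypothetical names, and it assumes booleans may be written as the strings "true"/"false", as in the sample config.

```python
import json

# Keys this README marks as required (assumption: no other keys are mandatory).
REQUIRED_KEYS = ("api-secret", "raw-data", "events", "start-date")


def check_config(config):
    """Return a list of problems found in a tap-mixpanel style config dict."""
    problems = [f"missing required key: {key}"
                for key in REQUIRED_KEYS if key not in config]
    # The sample config writes booleans as strings, so accept both
    # real booleans and the strings "true"/"false".
    events_on = str(config.get("events", "")).lower() == "true"
    if events_on and not config.get("event-names"):
        problems.append("event-names must be a non-empty array when events is true")
    return problems


sample = json.loads("""
{
  "api-secret": "your-api-secret",
  "raw-data": "true",
  "events": "true",
  "event-names": ["event1", "event2"],
  "start-date": "2017-05-10T00:00:00Z"
}
""")
print(check_config(sample))  # prints []
```

An empty list means no problems were found; anything else is a human-readable list of what to fix in config.json.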
raw-data (required): Determines whether or not to sync Mixpanel raw data. This is a bool.
events (required): Determines whether or not to sync Mixpanel events. This is a bool.
event-names (required if events is true): Array must be populated with valid event names. The Mixpanel API expects an array of event names.
start-date (required): The tap pulls data from after this date.
end-date (optional): Limits the data to the days between start-date and end-date. If no end-date is provided, the default is the current day.
- [Optional] Create the initial state file
You can provide a JSON file that contains a start date to pull data from. This will override the required start-date in the config file. After the program runs, it writes a new state to stdout in which the old end-date becomes the new state's start-date. See the Singer documentation for more information on states.

    {"start-date": "2017-01-17T20:00:00Z"}

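The override and rollover rules described above can be sketched as follows. This is only an illustration of the behavior this README describes, not the tap's actual code; `resolve_start_date` and `next_state` are hypothetical names.

```python
def resolve_start_date(config, state=None):
    """Pick the start date: a value from the state file overrides the config."""
    if state and "start-date" in state:
        return state["start-date"]
    return config["start-date"]


def next_state(end_date):
    """Build the state emitted after a run: the old end-date becomes the new start-date."""
    return {"start-date": end_date}


config = {"start-date": "2017-05-10T00:00:00Z", "end-date": "2017-05-11T00:00:00Z"}
state = {"start-date": "2017-01-17T20:00:00Z"}

print(resolve_start_date(config))         # 2017-05-10T00:00:00Z
print(resolve_start_date(config, state))  # 2017-01-17T20:00:00Z
print(next_state(config["end-date"]))     # {'start-date': '2017-05-11T00:00:00Z'}
```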
- Run the application
tap-mixpanel can be run with:

    tap-mixpanel --config config.json [--state state.json]

- [Optional] Save state
› tap-mixpanel --config config.json --state state.json >> state.json
› tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
Keep in mind that if you use this feature, you'll need to update the end-date in the config, or else you'll end up pulling data from only one day.
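For anyone scripting this step in Python rather than the shell, the `tail -1` idea (keep only the last line of the output as the new state) might look like the sketch below. It assumes, as the shell commands above imply, that the last non-empty line of the output is the state JSON. `last_state` is a hypothetical helper, and the sample output lines are made up for illustration.

```python
import json


def last_state(output_text):
    """Return the last non-empty line of the tap's output, parsed as JSON."""
    lines = [line for line in output_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output lines to read a state from")
    return json.loads(lines[-1])


# Made-up output: one placeholder line followed by the final state line.
output = '{"hypothetical": "record line"}\n{"start-date": "2017-05-11T00:00:00Z"}\n'
print(last_state(output))  # {'start-date': '2017-05-11T00:00:00Z'}
```

The returned dict can then be written back to state.json with `json.dump` before the next run.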
- Events
Events currently does not sort the returned dates, so the output may not be in chronological order.
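If ordering matters downstream, the records can be sorted after the fact. The sketch below assumes each record is a dict with an ISO-formatted date string under a `date` key. That record shape is a guess for illustration, not something this README specifies, and `sort_by_date` is a hypothetical helper.

```python
def sort_by_date(records, key="date"):
    """Return records ordered by an ISO-formatted date string.

    ISO 8601 dates sort correctly as plain strings, so no parsing is needed.
    """
    return sorted(records, key=lambda record: record[key])


# Made-up records in the assumed shape.
records = [
    {"date": "2017-05-11", "event": "event1", "count": 3},
    {"date": "2017-05-10", "event": "event1", "count": 7},
]
for record in sort_by_date(records):
    print(record["date"], record["count"])
```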
- Raw Data
The Mixpanel API is very slow at returning the raw data export. Be prepared to wait if you're trying to pull a large amount of data. Also, events can have any number of custom properties, so you cannot depend on each event object returned to have the same set of properties.
This can cause unexpected output if you use target-csv.
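One way to cope with varying properties when producing CSV yourself is to collect the union of all property names first and leave missing values blank. This is a workaround sketch using Python's standard `csv` module, not a description of how target-csv behaves; `write_events_csv` and the sample events are hypothetical.

```python
import csv
import io


def write_events_csv(events, out):
    """Write events with differing keys to CSV using the union of all keys."""
    # Collect every property name seen across all events, in first-seen order.
    fieldnames = []
    for event in events:
        for key in event:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills in a blank for properties an event does not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(events)
    return fieldnames


# Made-up events with different custom properties.
events = [
    {"event": "signup", "plan": "free"},
    {"event": "login", "browser": "Chrome"},
]
buffer = io.StringIO()
print(write_events_csv(events, buffer))  # ['event', 'plan', 'browser']
print(buffer.getvalue())
```

The trade-off is that all events must be read before the header can be written, which may be costly for a large raw data export.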