In July 2024 I was contacted by a headhunter looking to hire a Python Engineer for the Man Group. To prepare for the interview I spent a few days studying the tech stack listed in the job description [3]. Nothing really came of the interview, but I did come up with a possible overall architecture diagram (see below) and found out that much of Man AHL's tooling is open source. This repo is a place to put stuff as I learn more about their tech stack.
The following technologies have been mentioned in connection with Man Group in various open sources:
Component | Source | Description | Logo | Notes |
---|---|---|---|---|
ENVIRONMENT | | | | |
Bare Metal | 2. | Hardware | | |
Linux | 5. | OS | | |
OpenStack | 5. | Cloud IaaS: software for provisioning bare-metal hardware as cloud resources | | |
Prometheus | 5. | Server monitoring software | | Mentioned in close association with Grafana in source 1. |
Grafana | 5. | Data visualisation software | | Mentioned in close association with Prometheus in source 1. Probably used to visualise output from Prometheus for continuous server monitoring. |
INFRASTRUCTURE | | | | |
Docker | 5. | Containerisation software; allows standardised images of other software to be deployed hardware-agnostically | | |
Kubernetes | 5. | Container management software | | |
Ansible | 5. | Infrastructure as code; provisioning software | | |
DATASOURCE | | | | |
RMDS/TREP | 1. | Reuters market data platform | | Mentioned in close association with Kafka in source 1. |
Kafka | 5. | Distributed event streaming platform (event log) | | Mentioned in close association with RMDS/TREP in source 1. Probably used to log tick data into files. |
STORAGE | | | | |
ArcticDB | 4. | Man's in-house open source data storage solution | | Mentioned in close association with MongoDB in source 1. |
MongoDB | 5. | Document-oriented DB | | Mentioned in close association with ArcticDB in source 1. |
Oracle | 5. | Relational DB | | |
MINING | | | | |
Elasticsearch | | Indexing/search engine | | Inferring that Man uses this from ELKDocker, as it looks like a fitting solution for indexing data files in MongoDB. |
Spark | 1. | Analytics engine for large-scale data | | Probably both Apache Spark and PySpark. |
Airflow | 5. | Workflow management software for data engineering pipelines | | |
PROGRAMMING | | | | |
Python3 - numpy | 5. | Array maths library | | |
Python3 - scipy | 5. | Scientific computing library | | |
Python3 - pandas | 5. | Dataframe manipulation library | | |
Python3 - scikit-learn | 5. | Machine learning library | | |
Python3 - TensorFlow | 5. | Machine learning library | | |
Java | 5. | General-purpose programming language | | |
C/C++ | 1. | General-purpose programming languages | | |
Bash | | General-purpose scripting language | | Unconfirmed, but Airflow uses Bash to integrate with non-Python scripts and source 4 mentions scripting. |
CI/CD | | | | |
Jenkins | 5. | Development pipeline automation software | | |
Bitbucket | 5. | Git repository hosting | | |
Jira | | Issue management software | | Unconfirmed, but Bitbucket has tight integration with Jira and source 4 mentions agile. |
TBD | | | | |
ELKDocker | 1. | | | Is this deviantony/docker-elk? |
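
The Kafka row guesses that tick data is logged into files. As a rough illustration of that idea only, here is a minimal standard-library Python sketch of an append-only tick log that is later dumped to a JSON Lines file. It does not use Kafka's API and says nothing about how Man Group actually does this: `Tick`, `TickLog` and `dump_to_file` are hypothetical names, and the symbols and prices are made-up sample values.

```python
import json
import tempfile
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass(frozen=True)
class Tick:
    """One market data update (hypothetical shape, not a real feed schema)."""
    symbol: str
    price: float
    timestamp: str  # ISO 8601 string, kept as text for simple JSON output


class TickLog:
    """In-memory append-only log: records are never changed once written."""

    def __init__(self) -> None:
        self._records: list[Tick] = []

    def append(self, tick: Tick) -> int:
        """Store a tick and return its offset (position in the log)."""
        self._records.append(tick)
        return len(self._records) - 1

    def read_from(self, offset: int) -> list[Tick]:
        """Return every tick at or after the given offset."""
        return self._records[offset:]


def dump_to_file(log: TickLog, path: Path, offset: int = 0) -> int:
    """Append ticks from `offset` onwards as JSON lines; return the count written."""
    ticks = log.read_from(offset)
    with path.open("a", encoding="utf-8") as handle:
        for tick in ticks:
            handle.write(json.dumps(asdict(tick)) + "\n")
    return len(ticks)


# Made-up sample ticks, for illustration only.
log = TickLog()
log.append(Tick("VOD.L", 71.32, "2024-06-14T08:00:00Z"))
log.append(Tick("VOD.L", 71.34, "2024-06-14T08:00:01Z"))
log.append(Tick("BARC.L", 212.55, "2024-06-14T08:00:01Z"))

out_path = Path(tempfile.mkdtemp()) / "ticks.jsonl"
written = dump_to_file(log, out_path)
```

Because consumers read by offset, a file writer like this can stop and resume from the last offset it handled without the producer needing to know about it, which is the property that makes an event log a plausible buffer between a market data feed and storage.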
In order to get a better grasp on the technologies listed above, I have compiled a list of learning resources, available here.
1. JD – Python Engineer – Market Data Platform.pdf (non-public)
2. PyData: James Blackburn - Python and MongoDB as a Platform for Financial Market Data, last accessed 2024-06-15, https://www.youtube.com/watch?v=FVyIxdxsyok
3. LinedIn_Ad_15-06-2024 14-09-47.png
4. ArcticDB homepage, last accessed 2024-06-15, https://arcticdb.io/
5. LinedIn_Ad_19-06-2024 20-39-27.png