A fellow front-end engineer asked me for good starting points to learn AI and ML. That’s when I realized how large the stack of technologies an ML engineer or data scientist needs to know has become. Some of the technologies I use on a daily basis include:
- Python: the underlying programming language
- JupyterLab: for fast experimentation
- Pandas: for processing small datasets
- Apache Spark / Beam: for processing big data
- scikit-learn: for ML basics, data splitting, and metrics
- PyTorch: for neural networks
- LangChain: for building LLM chains
- Streamlit / Plotly: for debug interfaces
- Productionization: Docker, Kubernetes, Cloud platforms (AWS, GCP, Azure), Terraform, FastAPI
In addition, I use numerous other tools for logging, monitoring, alerting, analytics, and dependency management, along with cloud-specific services, dbt, Kafka, BI tools, and various databases.
It can be overwhelming, especially since a new and improved tool or technology seems to emerge almost every day.
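If you want something concrete to try first, here is a minimal sketch that touches three items from the list above: Python, Pandas, and scikit-learn. The dataset (scikit-learn's built-in iris sample), the logistic regression model, the 80/20 split, and the `random_state` value are all my own choices for illustration, not a recommendation for real projects.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in dataset as a pandas DataFrame (features) and Series (labels).
iris = load_iris(as_frame=True)
features: pd.DataFrame = iris.data
labels: pd.Series = iris.target

# Hold out 20% of the rows for evaluation.
# random_state (an arbitrary value here) makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42, stratify=labels
)

# Fit a simple baseline classifier on the training rows.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score the model on the held-out rows only.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

It runs as-is in a JupyterLab cell, which makes it easy to swap in a different model or metric and see what changes.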
Happy learning!