Who Should Attend?
This data engineering for data scientists course is designed for data scientists who would like to gain a complete and detailed understanding of a big data solution and the dynamics of data.
Attendees should be familiar with the basic concepts of a big data/analytics solution and have experience with Python or R.
- What are the different components of a big data/analytics solution?
- What are the key big data technologies?
- What is the added value of a data lake for analytics?
- How should I approach my analytics projects for maximum operability?
- How can I explore the data in my lake?
- How do I make analytics on a data lake work?
- What are useful programming languages and libraries for running analytics on a cluster?
- Advanced Python and Spark topics (packaging, generators, multi-CPU performance, etc.)
Throughout the course, hands-on exercises reinforce the topics being discussed.