I am an Applied Scientist, currently working at Zalando (Berlin, Germany). Throughout my career I have contributed to the design and development of several machine learning projects, ranging from product localisation to optimising onsite marketing campaigns. In the past I also conducted independent research in natural language processing and statistical modelling. I have a strong interest in building complex systems using Python, Spark, and cloud technologies such as AWS. I am a career changer with a background in humanities and music.
Learn more about handling Big Data with PySpark and why PySpark might become your go-to framework for Exploratory Data Analysis (EDA) and Feature Engineering (FE)! This talk will show the main differences between the Pandas and PySpark frameworks and outline the advantages of performing EDA and FE with PySpark. It will be most beneficial for Data Scientists, Data Analysts, and other attendees who perform data exploration and wrangling. They will learn about handling Big Data with PySpark, its functionality for EDA and FE, and ways of visualising the results. This talk is a good fit for a Beginner to Intermediate level audience with prior experience in Python and SQL.