Pulsifi is looking for an experienced Data Engineer to work with our team of Data Scientists and IT Engineers in the development of our product. You are expected to contribute to data extraction and aggregation strategies, and to work alongside the data science and IT teams in the execution of projects.
This is a rare opportunity for you to be a key contributor to the development of a significant product that will impact the future of HR globally. If you are passionate about truly changing the world and making a big impact, building great products, and working and thriving in an environment of focused and committed teams, we would love to talk to you.
RESPONSIBILITIES
- Design and build data pipeline architectures for optimal data collection.
- Collaborate closely with the technical team (software engineers and data scientists) to ensure optimal, consistent data delivery across ongoing projects.
- Plan and build web crawling tools.
- Identify, design, and implement internal process improvements such as automating manual processes, streamlining data delivery, and redesigning cloud infrastructure for greater scalability.
- Build the infrastructure required for data extraction, transformation, and loading (ETL) from a wide variety of data sources using SQL, NoSQL, or unstructured data technologies.
- Work with stakeholders, including the C-suite and the tech team, to assist with data-related technical issues and support their data infrastructure needs.
- Create data tools for analytics and assist data scientists in building and optimizing our product into an innovation leader.
REQUIREMENTS
- Bachelor’s degree in computer science, electrical engineering, informatics, or another quantitative field.
- At least 2–3 years of experience as a Data Engineer.
- Advanced working knowledge of SQL and experience with relational databases, as well as working familiarity with a variety of NoSQL technologies (MongoDB, HBase, DynamoDB, Cassandra, Redis, Neo4j, etc.).
- Experience in building and optimizing big data pipelines, architectures, and data sets.
- Strong analytical skills related to working with unstructured datasets.
- Experience with big data tools (Hadoop, Spark, Kafka, etc.) and data science libraries (PyTorch, spaCy, etc.).
- Proficiency in multiple programming languages: Python, Java, C++, Scala, etc.
- Experience in building processes to support data extraction, transformation, and loading (ETL).
- History of manipulating, processing, and extracting value from large, disconnected datasets.
- Experience with AWS cloud services.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Highly motivated and self-directed.
- Excited by the prospect of optimizing and redesigning our company’s data architecture as we scale up.