Who are we?
We are a globally expanding software technology company that helps brands communicate more effectively with their audiences. We look forward to growing our team's capabilities, developing high-end solutions beyond existing boundaries, and establishing our brand as a global powerhouse.
We are free to work from wherever we want and go to the office whenever we like!
What is the role?
We are looking for a highly skilled and motivated Senior Data Engineer to join our dynamic team. The ideal candidate will have extensive experience in building and managing data pipelines, NoSQL databases, and cloud-based data platforms. You will work closely with data scientists and other engineers to design and implement scalable data solutions.
Key Responsibilities:
- Design, build, and maintain scalable data pipelines and architectures.
- Implement data lake solutions on cloud platforms.
- Develop and manage NoSQL databases (e.g., MongoDB, Cassandra).
- Work with graph databases (e.g., Neo4j) and big data technologies (e.g., Hadoop, Spark).
- Utilize AWS cloud services (e.g., S3, Redshift, Lambda, Kinesis, EMR, SQS, SNS).
- Ensure data quality, integrity, and security.
- Collaborate with data scientists to support machine learning and AI initiatives.
- Optimize and tune data processing workflows for performance and scalability.
- Stay up to date with the latest data engineering trends and technologies.
Detailed Responsibilities and Skills:
- Business Objectives and Requirements:
  - Engage with business IT and data science teams to understand their needs and expectations from the data lake.
  - Define real-time analytics use cases and expected outcomes.
  - Establish data governance policies for data access, usage, and quality maintenance.
- Technology Stack:
  - Real-time data ingestion using Apache Kafka or Amazon Kinesis.
  - Scalable storage solutions such as Amazon S3, Google Cloud Storage, or Hadoop Distributed File System (HDFS).
  - Real-time data processing using Apache Spark or Apache Flink.
  - NoSQL databases like Cassandra or MongoDB, and specialized time-series databases like InfluxDB.
- Data Ingestion and Integration:
  - Set up data producers for real-time data streams.
  - Integrate batch data processes to merge with real-time data for comprehensive analytics.
  - Implement data quality checks during ingestion.
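As a hedged illustration of the ingestion-time quality checks mentioned above, a minimal validation step in Python might look like the following sketch. The event schema and required fields here are hypothetical, not part of our actual data contract:

```python
from datetime import datetime, timezone

# Hypothetical clickstream event schema -- adjust to the real data contract.
REQUIRED_FIELDS = {"event_id", "user_id", "event_type", "timestamp"}

def validate_event(event: dict) -> list:
    """Return a list of data-quality problems found in one raw event."""
    problems = []
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        problems.append("missing fields: %s" % sorted(missing))
    ts = event.get("timestamp")
    if ts is not None:
        try:
            parsed = datetime.fromisoformat(ts)
        except (TypeError, ValueError):
            problems.append("timestamp is not valid ISO-8601")
        else:
            if parsed.tzinfo is None:  # treat naive timestamps as UTC
                parsed = parsed.replace(tzinfo=timezone.utc)
            # Reject events stamped in the future (clock skew, producer bugs).
            if parsed > datetime.now(timezone.utc):
                problems.append("timestamp is in the future")
    return problems

good = {"event_id": "e1", "user_id": "u1", "event_type": "click",
        "timestamp": "2024-01-01T00:00:00+00:00"}
bad = {"event_id": "e2", "timestamp": "not-a-date"}
print(validate_event(good))  # []
print(validate_event(bad))
```

In a streaming pipeline, events that fail such checks would typically be routed to a dead-letter topic or quarantine area rather than dropped silently.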
- Data Processing and Management:
  - Utilize Spark Streaming or Flink for real-time data processing.
  - Enrich clickstream data by integrating with other data sources.
  - Organize data into partitions based on time or user attributes.
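The enrichment and partitioning steps above can be sketched in plain Python as follows. The profile lookup table and the `dt=`/`hr=` partition scheme are illustrative assumptions; in production the lookup would come from another data source (e.g., a broadcast join in Spark):

```python
from datetime import datetime, timezone

# Hypothetical user-profile reference data from another source.
USER_PROFILES = {"u42": {"country": "DE", "plan": "pro"}}

def enrich(event: dict) -> dict:
    """Join a clickstream event with user-profile attributes."""
    profile = USER_PROFILES.get(event["user_id"], {})
    return {**event, **profile}

def partition_key(event: dict) -> str:
    """Derive a time-based partition path (dt=YYYY-MM-DD/hr=HH)."""
    ts = datetime.fromisoformat(event["timestamp"]).astimezone(timezone.utc)
    return f"dt={ts:%Y-%m-%d}/hr={ts:%H}"

event = {"user_id": "u42", "event_type": "click",
         "timestamp": "2024-06-01T13:45:00+00:00"}
enriched = enrich(event)
print(enriched["country"])       # DE
print(partition_key(enriched))   # dt=2024-06-01/hr=13
```

Time-based partitions like these keep real-time writes append-only and let downstream queries prune by date and hour.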
- Data Lake Storage and Architecture:
  - Implement a multi-layered storage approach (raw, processed, and aggregated layers).
  - Use metadata repositories to manage data schemas and track data lineage.
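One way the multi-layered layout above might map onto object storage is sketched below. The key convention (layer/dataset/date/file) is an assumption for illustration, not a prescribed standard:

```python
# The three lake layers named above: raw -> processed -> aggregated.
LAYERS = ("raw", "processed", "aggregated")

def object_key(layer: str, dataset: str, dt: str, filename: str) -> str:
    """Build an S3-style object key for the layered data lake layout."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return f"{layer}/{dataset}/dt={dt}/{filename}"

print(object_key("raw", "clickstream", "2024-06-01", "part-0000.json"))
# raw/clickstream/dt=2024-06-01/part-0000.json
```

A consistent key convention like this is what lets a metadata catalog register each layer as a partitioned table and track lineage between them.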
- Security and Compliance:
  - Implement fine-grained access controls.
  - Encrypt data in transit and at rest.
  - Maintain logs of data access and changes for compliance.
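At its simplest, the fine-grained access control mentioned above reduces to a policy lookup like the sketch below. The roles and their permitted layers are hypothetical; a real deployment would enforce this with the platform's IAM policies rather than application code:

```python
# Hypothetical role-based policy: which lake layers each role may read.
POLICY = {
    "data_engineer": {"raw", "processed", "aggregated"},
    "data_scientist": {"processed", "aggregated"},
    "analyst": {"aggregated"},
}

def can_read(role: str, layer: str) -> bool:
    """Fine-grained access check: may this role read objects in this layer?"""
    return layer in POLICY.get(role, set())

print(can_read("data_engineer", "raw"))  # True
print(can_read("analyst", "raw"))        # False
```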
- Monitoring and Maintenance:
  - Continuously monitor the performance of data pipelines.
  - Implement robust error handling and recovery mechanisms.
  - Monitor and optimize costs associated with storage and processing.
- Continuous Improvement and Scalability:
  - Establish feedback mechanisms to improve data applications.
  - Design the architecture to scale horizontally.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
- 5+ years of experience in data engineering or related roles.
- Proficiency in NoSQL databases (e.g., MongoDB, Cassandra) and graph databases (e.g., Neo4j).
- Strong experience with cloud platforms (e.g., AWS, GCP, Azure).
- Hands-on experience with big data technologies (e.g., Hadoop, Spark).
- Proficiency in Python and data processing frameworks.
- Experience with Kafka, ClickHouse, and Redshift.
- Knowledge of ETL processes and data integration.
- Familiarity with AI, ML algorithms, and neural networks.
- Strong problem-solving skills and attention to detail.
- Excellent communication and teamwork skills.
- Entrepreneurial spirit and a passion for continuous learning.
Join our team!