DATA SCIENCE/ML/AI GENERAL
~Welcome home, Master
Useful links
Relational Database https://www.freecodecamp.org/learn/relational-database/
Scientific Computing with Python https://www.freecodecamp.org/learn/scientific-computing-with-python/
Data Analysis with Python https://www.freecodecamp.org/learn/data-analysis-with-python/
Machine Learning with Python https://www.freecodecamp.org/learn/machine-learning-with-python/
All of the courses above are free, including the certification. The projects are relatively easy, and you don't need to watch every lesson: you can jump straight to the projects and claim your certificate.
Learn Data Science 6h - FreeCodeCamp https://www.youtube.com/watch?v=ua-CiDNNj30
Learn Data Science in 10 Hours - Edureka https://www.youtube.com/watch?v=-ETQ97mXXF0
The videos above work as an overview of Data Science for those who have had no prior contact with it. I highly recommend them. You can also enable translated subtitles if you don't speak English.
Roadmap
1. Programming
- Data Types, Variables, and Operators
- Control Flow and Loops
- Functions and Modules
- Object-Oriented Programming
- Exception Handling
2. Database Management
- Relational Database Management Systems (RDBMS)
- SQL Fundamentals
- NoSQL Databases
- Data Warehousing
3. Data Modeling
- Conceptual, Logical, and Physical Data Modeling
- Entity-Relationship Diagrams (ERD)
- Normalization
4. Data Integration
- Extract, Transform, and Load (ETL) Process
- Data Integration Tools
- Data Migration
5. Data Processing
- Data Pipelines
- Batch Processing
- Stream Processing
- Apache Spark
6. Cloud Computing
- Cloud Platforms
- Cloud Data Warehouses
- Cloud Data Lakes
- Cloud Services (AWS, Azure, GCP)
7. Data Governance
- Data Quality
- Data Security
- Data Privacy
- Compliance and Regulations
8. Big Data
- Hadoop Ecosystem
- MapReduce
- Apache Hive
- Apache Pig
9. Data Visualization
- Data Visualization Tools
- Dashboarding Tools
- Business Intelligence (BI) Tools
10. Machine Learning
- Machine Learning Fundamentals
- Feature Engineering
- Model Training
- Model Deployment
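To give a feel for how a few of the roadmap topics connect (SQL fundamentals from item 2, the ETL process from item 4, and batch processing from item 5), here is a minimal sketch that uses only the Python standard library. The CSV contents, the table name `sales`, and the function names `extract`, `transform`, `load`, and `run_pipeline` are made up for this example. A real pipeline would read from actual sources and would usually rely on dedicated tools.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; a real pipeline would read a file, an API, or a queue.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast column types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records (a simple data quality rule)
        cleaned.append((int(row["order_id"]), row["customer"], float(row["amount"])))
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline():
    """Run one batch end to end and return the total amount per customer."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is saved to disk
    load(conn, transform(extract(RAW_CSV)))
    totals = conn.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return totals


if __name__ == "__main__":
    print(run_pipeline())  # [('alice', 14.75)]
```

The row for `bob` has no amount, so the transform step drops it and only `alice` appears in the result. Keeping the three steps as separate functions makes each one easy to test and to swap out later, for example replacing SQLite with a cloud data warehouse.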
Projects for portfolio
- Building a Data Pipeline: Develop an end-to-end data pipeline that extracts data from a source, transforms it, and loads it into a destination. Use tools like Apache Kafka, Apache Spark, and AWS S3 to build the pipeline.
- Building a Data Warehouse: Design and build a data warehouse from scratch. Use tools like Snowflake, AWS Redshift, or Google BigQuery to create the data warehouse. Load data from different sources and build a reporting dashboard to visualize the data.
- Building a Real-time Streaming System: Build a real-time streaming system that processes data as it arrives. Use tools like Apache Kafka, Apache Flink, and Apache NiFi to build the system. Use the system to process data from a source and visualize the data in real time.
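Before reaching for Kafka or Flink on the streaming project, it can help to see the core idea in plain Python. The sketch below counts events per fixed-size (tumbling) time window as they arrive. It is only a toy stand-in for what a stream processor does, not how those tools are implemented. The function name `tumbling_window_counts` and the `EVENTS` data are hypothetical, and the code assumes events arrive in timestamp order, which real streams do not guarantee.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, counts_per_key) each time a window closes.

    `events` is any iterable of (timestamp_seconds, key) pairs, assumed to
    arrive in timestamp order. It stands in for a message stream.
    """
    current_start = None
    counts = defaultdict(int)
    for timestamp, key in events:
        # Map the timestamp to the start of the window it belongs to.
        window_start = (timestamp // window_seconds) * window_seconds
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The first event of a later window closes the current one.
            yield current_start, dict(counts)
            counts = defaultdict(int)
            current_start = window_start
        counts[key] += 1
    if current_start is not None:
        yield current_start, dict(counts)  # flush the last open window


# Hypothetical page-view events: (seconds since start, page name).
EVENTS = [(0, "home"), (3, "cart"), (7, "home"), (12, "home"), (14, "cart"), (31, "home")]

if __name__ == "__main__":
    for start, counts in tumbling_window_counts(EVENTS, 10):
        print(start, counts)
```

Because the function is a generator, each result is produced as soon as its window closes instead of after all data has been read, which is the main difference from batch processing. Windows that receive no events (20 to 29 seconds in the sample data) are not emitted in this sketch.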