Data Science from scratch: import analysis
Posted: Sun Dec 22, 2024 6:00 am
Data Science from Scratch is my final project for the Data Science course, and also my first data science
project. I wanted to combine the new knowledge acquired during the course with my
professional knowledge. I have a degree in business administration and
have worked with imports and exports for many years.
I wanted to take the opportunity to set up a data science project from scratch, to see what it would be like
from A to Z. My final project aimed to analyze raw import data for the state of Paraná from Comex Stat, using
Amazon's cloud service, AWS.
Comex Stat is the free portal for consulting foreign trade statistics
from the Ministry of Economy. Detailed data on Brazilian exports and
imports are published monthly, based on
declarations made by exporters and importers
in SISCOMEX (Integrated Foreign Trade System).
Amazon Web Services (AWS) has been around since 2006 and offers over
175 full-featured services from its data centers to companies through
cloud computing. According to AWS itself, this global cloud infrastructure is the most secure, comprehensive,
and reliable platform available. AWS aims to reduce costs and provide computing scale.
You choose the services you want to use and how you want to use them, paying only for what you use
and canceling them when you no longer need them. Cost-effectiveness, scalability and
elasticity are therefore its strong points. These services help companies to
be more productive and focus on their core business. They are used by
hundreds of thousands of companies in 190 countries around the world.
PROJECT STEPS – Data Science from Scratch
The Data Science from Scratch project consists of several stages, which I will detail throughout the article:
1. Create an account on Amazon Web Services – AWS
2. Create Database on Amazon RDS – Relational Database Service
3. Connect to a PostgreSQL database instance
4. Which files will be analyzed?
5. PgAdmin 4 – “Import file”
6. Analysis of data from tables
7. Visualization with Power BI
So let's get started with the Data Science from Scratch project. I'll start with an overview of the virtual environment.
Virtual environment overview
What is a VPC?
– A Virtual Private Cloud (VPC) is a private, logically isolated network inside the public cloud,
and it is part of how AWS is structured. A default VPC is created when
you create an AWS account, which was the first step of the
project.
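To make the idea of a default VPC a little more concrete, here is a minimal sketch in Python of how one could look it up programmatically. It is not part of the original project, which used the AWS console. The function name find_default_vpcs is hypothetical, and the code assumes an object that behaves like the EC2 client from the boto3 library (only its describe_vpcs call is used).

```python
from typing import Any, Dict, List


def find_default_vpcs(ec2: Any) -> List[Dict[str, str]]:
    """Return the ID and CIDR block of each default VPC visible to `ec2`.

    `ec2` is assumed to behave like boto3.client("ec2"); only its
    describe_vpcs method is called.
    """
    # Ask EC2 only for VPCs flagged as the default one in this region.
    response = ec2.describe_vpcs(
        Filters=[{"Name": "is-default", "Values": ["true"]}]
    )
    return [
        {"vpc_id": vpc["VpcId"], "cidr": vpc["CidrBlock"]}
        for vpc in response.get("Vpcs", [])
    ]


# Hypothetical usage, assuming boto3 is installed and credentials are configured
# (the region name is only an example):
#   import boto3
#   print(find_default_vpcs(boto3.client("ec2", region_name="sa-east-1")))
```

Default VPCs exist per region, so the result depends on which region the client points to.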
1. Create an account on Amazon Web Services – AWS
Creating an AWS account is easy! Just follow the instructions on the website. You will need
to provide an international credit card number (*) so that your data
can be authenticated and to ensure that Amazon can receive payment for its
services. In this case, the amount of US$1.00 will be debited from your credit card for
validation. After that, a verification code will be sent via SMS to complete the
registration. And that's it! The root account has been created! When you register, you will
automatically have access to all AWS services.
(*) As of November 1, 2020, Amazon AWS Serviços Brasil Ltda. (AWS SBL)
was set to begin operating in Brazil as a local entity to provide AWS services and
bill Brazilian customers. In other words, other forms of
payment would be accepted, including national credit cards.
The AWS documentation is quite extensive, explaining step by step how to create and
use the available services. It also offers suggestions for best practices.
One example is creating an administrator user in IAM (Identity and Access
Management) instead of using the root account for everyday work. You then add that user to an
administrators group. IAM lets you create groups with multiple users, each with an individual
password and specific permissions, just as Professor Charles set up for us in the SQL class.
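The administrator user and group described above can also be created from code. What follows is only a sketch under stated assumptions: I did this step in the AWS console, the function name create_admin_user and the user and group names are hypothetical, and the code expects an object that behaves like the IAM client from the boto3 library. The policy ARN is the AWS managed AdministratorAccess policy.

```python
from typing import Any

# AWS managed policy that grants full administrator permissions.
ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def create_admin_user(iam: Any, user_name: str, group_name: str) -> None:
    """Create a group with administrator permissions and add a new user to it.

    `iam` is assumed to behave like boto3.client("iam").
    """
    # 1. Create the group that will carry the permissions.
    iam.create_group(GroupName=group_name)
    # 2. Attach the administrator policy to the group, not to the user.
    iam.attach_group_policy(GroupName=group_name, PolicyArn=ADMIN_POLICY_ARN)
    # 3. Create the user and place it in the group.
    iam.create_user(UserName=user_name)
    iam.add_user_to_group(GroupName=group_name, UserName=user_name)


# Hypothetical usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   create_admin_user(boto3.client("iam"), "Administrator", "Administrators")
```

Attaching permissions to the group rather than to each user means a new colleague only needs to be added to the group to receive the same access. The sketch does not set a console password for the user; that would be a separate step.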