Introduction-to-Importing-Data-in-Python

安奇
2023-12-01

1. Introduction and flat files

1.1 Welcome to the course!

1.2 Exploring your working directory

In order to import data into Python, you should first have an idea of what files are in your working directory.

IPython, which is running on DataCamp’s servers, has a bunch of cool commands, including its magic commands. For example, starting a line with ! gives you complete system shell access. This means that the IPython command !ls will display the contents of your current directory.

Your task is to use !ls to check out the contents of your current directory and answer the following question: which of the following files is in your working directory?

□ huck_finn.txt

□ titanic.csv

■ moby_dick.txt
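
Outside an IPython session, !ls is not available. A rough plain-Python equivalent, using only the standard library, might look like the sketch below; the variable name files is illustrative and not part of the course.

```python
import os

# List the entries of the current working directory,
# roughly what `!ls` shows in IPython.
files = sorted(os.listdir(os.getcwd()))
print(files)

# Check for one specific file by name.
print('moby_dick.txt' in files)
```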

1.3 Importing entire text files

In this exercise, you’ll be working with the file moby_dick.txt. It is a text file that contains the opening sentences of Moby Dick, one of the great American novels! Here you’ll get experience opening a text file, printing its contents to the shell and, finally, closing it.

Instruction

  • Open the file moby_dick.txt as read-only and store it in the variable file. Make sure to pass the filename enclosed in quotation marks ''.
  • Print the contents of the file to the shell using the print() function. As Hugo showed in the video, you’ll need to apply the method read() to the object file.
  • Check whether the file is closed by executing print(file.closed).
  • Close the file using the close() method.
  • Check again that the file is closed, as you did above.
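
The steps above can be sketched as follows. So that the example runs anywhere, it first writes a tiny stand-in moby_dick.txt to a temporary directory; that stand-in file and the name sample_path are assumptions for illustration, not the course's actual file.

```python
import os
import tempfile

# Stand-in for the course file (assumption): a short sample written
# to a temporary directory so the example is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), 'moby_dick.txt')
with open(sample_path, 'w') as f:
    f.write('CHAPTER 1. Loomings.\n\nCall me Ishmael.\n')

# Open the file as read-only and store it in the variable file.
file = open(sample_path, mode='r')

# Print the full contents of the file.
text = file.read()
print(text)

# Check whether the file is closed.
print(file.closed)  # False

# Close the file.
file.close()

# Check again that the file is closed.
print(file.closed)  # True
```

In the course environment you would pass 'moby_dick.txt' directly to open() instead of sample_path.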

1.4 Importing text files by lines

For large files, you may not want to print all of the content to the shell; you may wish to print only the first few lines. Enter the readline() method, which allows you to do this. When a file called file is open, you can print out the first line by executing file.readline(). If you execute the same command again, the second line will print, and so on.

In the introductory video, Hugo also introduced the concept of a context manager. He showed that you can bind a variable file by using a context manager construct:

with open('huck_finn.txt') as file:

While still within this construct, the variable file will be bound to open('huck_finn.txt'); thus, to print the first line of the file to the shell, all the code you need to execute is:

with open('huck_finn.txt') as file:
    print(file.readline())

You’ll now use these tools to print the first few lines of moby_dick.txt!

Instruction

  • Open moby_dick.txt using the with context manager and the variable file.
  • Print the first three lines of the file to the shell by using readline() three times within the context manager.
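
A minimal sketch of these two steps is shown below. As an assumption for illustration, it writes a small stand-in file with made-up lines to a temporary directory first, so the names sample_path, first, second and third are not from the course.

```python
import os
import tempfile

# Stand-in for moby_dick.txt (assumption): four made-up lines.
sample_path = os.path.join(tempfile.mkdtemp(), 'moby_dick.txt')
with open(sample_path, 'w') as f:
    f.write('first line\nsecond line\nthird line\nfourth line\n')

# The context manager closes the file automatically when the block ends.
with open(sample_path) as file:
    # Each readline() call returns the next line, including its newline.
    first = file.readline()
    second = file.readline()
    third = file.readline()
    print(first)
    print(second)
    print(third)

# Outside the with block the file is already closed.
print(file.closed)  # True
```

Because each line returned by readline() keeps its trailing newline, print() adds a blank line between the printed lines; passing end='' to print() would suppress that.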

1.5 The Importance of flat files in data science

1.6 Pop quiz: examples of flat files?

You’re now well-versed in importing text files and you’re about to become a wiz at importing flat files. But can you remember exactly what a flat file is? Test your knowledge by answering the following question: which of these file types below is NOT an example of a flat file?

□ A .csv file.
□ A tab-delimited .txt.
■ A relational database (e.g. PostgreSQL).

1.7 Pop quiz: what exactly are flat files?

Which of the following statements about flat files is incorrect?

□ Flat files consist of rows and each row is called a record.

■ Flat files consist of multiple tables with structured relationships between the tables.

□ A record in a flat file is composed of fields or attributes, each of which contains at most one item of information.

□ Flat files are pervasive in data science.
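
To make the terms record and field concrete, here is a small sketch using the standard-library csv module. The data is made up for illustration and stands in for a tiny .csv flat file; it is not one of the course datasets.

```python
import csv
import io

# Made-up sample (assumption): a header row plus two records,
# held in memory instead of in a real .csv file.
flat_file = io.StringIO('name,age\nAda,36\nGrace,45\n')

reader = csv.reader(flat_file)
header = next(reader)    # the field (attribute) names
records = list(reader)   # one list of field values per row (record)

# Pair each field name with its value for every record.
for record in records:
    print(dict(zip(header, record)))
```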

1.8 Why we like flat files and the Zen of Python

1.9 Importing flat files using NumPy

1.10 Using NumPy to import flat files

1.11 Customizing your NumPy import

1.12 Importing different datatypes

1.13 Working with mixed datatypes (1)

1.14 Working with mixed datatypes (2)

1.15 Importing flat files using pandas

1.16 Using pandas to import flat files as DataFrames (1)

1.17 Using pandas to import flat files as DataFrames (2)

1.18 Customizing your pandas import

1.19 Final thoughts on data import

2. Importing data from other file types

2.1 Introduction to other file types

2.2 Not so flat any more

2.3 Loading a pickled file

2.4 Listing sheets in Excel files

2.5 Importing sheets from Excel files

2.6 Customizing your spreadsheet import

2.7 Importing SAS/Stata files using pandas

2.8 How to import SAS7BDAT

2.9 Importing SAS files

2.10 Using read_stata to import Stata files

2.11 Importing Stata files

2.12 Importing HDF5 files

2.13 Using File to import HDF5 files

2.14 Using h5py to import HDF5 files

2.15 Importing MATLAB files

2.16 Loading .mat files

2.17 The structure of .mat in Python

3. Working with relational databases in Python

3.1 Introduction to relational databases

3.2 Pop quiz: The relational model

3.3 Creating a database engine in Python

3.4 Creating a database engine

3.5 What are the tables in the database?

3.6 Querying relational databases in Python

3.7 The Hello World of SQL Queries!

3.8 Customizing the Hello World of SQL Queries

3.9 Filtering your database records using SQL’s WHERE

3.10 Ordering your SQL records with ORDER BY

3.11 Querying relational databases directly with pandas

3.12 Pandas and The Hello World of SQL Queries!

3.13 Pandas for more complex querying

3.14 Advanced querying: exploiting table relationships

3.15 The power of SQL lies in relationships between tables: INNER JOIN

3.16 Filter your INNER JOIN

3.17 Final Thoughts
