CSV files are handled here with Python's csv module; why the csv module is preferred over the usual read/write methods will become apparent in a bit.
# import csv module
import csv

# open the csv file with a context manager
with open('records.csv', 'r') as csv_file:
    # using the csv reader function
    csv_reader = csv.reader(csv_file)
    # loop through the csv_reader iterable object
    for line in csv_reader:
        # print each line in the reader object
        print(line)
['first_name', 'last_name', 'email']
['John', 'Doe', '[email protected]']
['Mary', 'Smith-Robinson', '[email protected]']
['Dave', 'Smith', '[email protected]']
['Jane', 'Stuart', '[email protected]']
First, the csv module is imported, then a context manager is used to open the CSV file; the file is read into a file object referenced as csv_file, using the open() function. Using the csv module's reader() function, each line in the CSV file is parsed into a reader object, csv_reader. A for loop and a print() function then display each line in the CSV file 😊. Each line comes back as a list object where each comma-separated field is a list item.
# open file with a context manager
with open('records.csv', 'r') as csv_file:
    # create reader object
    csv_reader = csv.reader(csv_file)
    # loop through reader object csv_reader
    for line in csv_reader:
        # print the field values under the field header email
        print(line[2])
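Inside the for loop, and within the print() function, is where the indexing of each list item (line in the CSV file) is done; index 2 selects the email field. Note that the header row is printed as well, since the reader object treats it like any other line. The field names alone can also be retrieved from the reader object.
# open file with a context manager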
with open('records.csv', 'r') as csv_file:
    # create reader object
    csv_reader = csv.reader(csv_file)
    # iterate through the csv_reader once
    print(f'Field names: {next(csv_reader)}')
Field names: ['first_name', 'last_name', 'email']
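Since the first line usually holds the field names, a common follow-on is to call next() once to set the header aside and then loop over the remaining lines. The sketch below is only an illustration: an in-memory io.StringIO stands in for records.csv, and the rows and example.com addresses are made up.

```python
import csv
import io

# in-memory stand-in for records.csv; the rows are made-up sample data
sample = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john@example.com\n"
    "Mary,Smith-Robinson,mary@example.com\n"
)

csv_reader = csv.reader(sample)
# consume the first line (the field names) so the loop below sees only data
header = next(csv_reader)
# collect the email column from the remaining lines
emails = [row[2] for row in csv_reader]

print(header)
print(emails)
```

With the header consumed first, indexing by position no longer picks up the field name itself.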
A for loop works through the next() function, which keeps getting called on the iterator each time. Here the next() function is called only once, which can be seen as iterating through the reader object once; this returns the first line in the CSV file, which most of the time holds the field names.
# open the file to read or get comma separated values or data from
with open('records.csv', 'r') as csv_file:
    # create reader object
    csv_reader = csv.reader(csv_file)
    # open/create the file to write comma separated values to
    # (newline='' leaves line endings to the csv module and avoids blank rows on Windows)
    with open('new_records.csv', 'w', newline='') as new_csv_file:
        # create writer object
        csv_writer = csv.writer(new_csv_file, delimiter='-')
        # iterate through the comma separated values of the initially opened file through the reader object
        for csv_lines in csv_reader:
            # write these values to the new file
            csv_writer.writerow(csv_lines)
The values are first read from records.csv through the csv module's reader object. A writer object is then created with the csv module's writer() function, which takes as argument the CSV file object. Each line read is held in the csv_lines variable; this variable is passed to the writer object's writerow() method, which writes these values into the last opened file (new_records.csv).
In the call to the csv.writer() function in the previous code block, a second argument was included, delimiter='-', so the values from the previously opened file are written into the newly opened file with each field value separated by a hyphen (-) instead of a comma (,). Values in CSV files are not always separated by a comma; as seen in the second file above, the delimiting character can be arbitrary. Commas are mostly used as a convention, and in some cases to improve readability.
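To see what the hyphen delimiter does to a single line, and that the same delimiter has to be supplied when reading the file back, here is a small sketch. It assumes an in-memory io.StringIO buffer in place of new_records.csv and a made-up row with an example.com address.

```python
import csv
import io

# hypothetical row; the last name contains the delimiter character
row = ['Mary', 'Smith-Robinson', 'mary@example.com']

# write the row to an in-memory buffer using a hyphen as delimiter
buffer = io.StringIO(newline='')
csv.writer(buffer, delimiter='-').writerow(row)
written = buffer.getvalue()
print(repr(written))  # 'Mary-"Smith-Robinson"-mary@example.com\r\n'

# read it back; the reader must be told about the same delimiter
buffer.seek(0)
read_back = next(csv.reader(buffer, delimiter='-'))
print(read_back)  # ['Mary', 'Smith-Robinson', 'mary@example.com']
```

Reading the same text with the default comma delimiter would return the whole line as a single field, so the delimiter argument matters on both sides.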
Notice that the csv module's writer() function knew to place field values that contained the delimiting character in double quotes, as seen in the image above; Smith-Robinson, for example, is written as "Smith-Robinson". Without the quotes, the file would have been hard to read or use in a program.
Although using the csv module's reader() and writer() functions seems like the standard way to handle CSV files, there is a better way to read from and write to CSV files, one that improves code readability and helps to explicitly manipulate and parse comma-separated values: the csv module's DictReader and DictWriter classes, for reading from and writing to CSV files respectively.
Reading from a CSV file with the csv module's DictReader class
# open the file to be read in a context manager
with open('records.csv', 'r') as csv_file:
    # create a DictReader object using the DictReader class
    csv_dict_reader = csv.DictReader(csv_file)
    # iterate through DictReader object
    for line in csv_dict_reader:
        # print each line in the CSV file (an OrderedDict here; a plain dict on Python 3.8+)
        print(line)
OrderedDict([('first_name', 'John'), ('last_name', 'Doe'), ('email', '[email protected]')])
OrderedDict([('first_name', 'Mary'), ('last_name', 'Smith-Robinson'), ('email', '[email protected]')])
OrderedDict([('first_name', 'Dave'), ('last_name', 'Smith'), ('email', '[email protected]')])
OrderedDict([('first_name', 'Jane'), ('last_name', 'Stuart'), ('email', '[email protected]')])
Using the DictReader class is very similar to using the reader() function as shown in the code block above. DictReader is used in place of the reader() function, thereby returning a DictReader object, as opposed to the reader object of the reader() function.
When iterating through the DictReader object, an OrderedDict object is returned for each line in the CSV file, as opposed to a list object from a reader object (from Python 3.8 onward, each line is returned as a regular dict instead). The first line of the file is not returned as data; it supplies the keys.
With an OrderedDict object returned for each line in the CSV file, field values are easy to index, as they are indexed by the field headers rather than ambiguous index numbers.
# open CSV file in a context manager
with open('records.csv', 'r') as csv_file:
    # create a DictReader object
    csv_dict_reader = csv.DictReader(csv_file)
    # iterate through DictReader object
    for line in csv_dict_reader:
        # get field values for the email field only
        print(line['email'])
Writing to a CSV file with the csv module's DictWriter class
# open CSV file to read comma separated values from it
with open('records.csv', 'r') as csv_file:
    # create DictReader object using the DictReader class
    csv_dict_reader = csv.DictReader(csv_file)
    # open new CSV file to write the values into it
    # (newline='' leaves line endings to the csv module and avoids blank rows on Windows)
    with open('new_records.csv', 'w', newline='') as new_csv_file:
        # create a list of the field names or headers of the field values that would be written to the file
        field_names = ['first_name', 'last_name', 'email']
        # create a DictWriter object using the DictWriter class
        # assign the field_names list above to the fieldnames parameter
        # pass a tab character as the delimiting character
        csv_dict_writer = csv.DictWriter(new_csv_file, fieldnames=field_names, delimiter='\t')
        # write the field header into the CSV file
        csv_dict_writer.writeheader()
        # iterate through the values read from the previous file
        for line in csv_dict_reader:
            # write the values to the new CSV file
            csv_dict_writer.writerow(line)
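As a variation on the block above, the field names do not have to be typed out by hand: a DictReader object exposes the header line through its fieldnames attribute, which can be handed straight to DictWriter. This is a sketch under assumptions, with in-memory io.StringIO buffers standing in for the two files and made-up sample rows.

```python
import csv
import io

# in-memory stand-ins for records.csv and new_records.csv (made-up data)
source = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john@example.com\n"
    "Jane,Stuart,jane@example.com\n"
)
target = io.StringIO(newline='')

csv_dict_reader = csv.DictReader(source)
# fieldnames is taken from the first line of the source
csv_dict_writer = csv.DictWriter(
    target, fieldnames=csv_dict_reader.fieldnames, delimiter='\t'
)
# write the header line, then copy every data line across
csv_dict_writer.writeheader()
for line in csv_dict_reader:
    csv_dict_writer.writerow(line)

output_lines = target.getvalue().splitlines()
print(output_lines)
```

Reusing the reader's field names keeps the two files in step if a column is later added to the source.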
Using the DictWriter class is very similar to using the writer() function; the significant differences are explained here. The list of field names is passed to DictWriter by assigning it to the fieldnames parameter. After the DictWriter object is created, the next line calls one of its methods, writeheader(), which makes sure that the field headers or field names are included when the values are written; the field headers are written to the top of the CSV file.
That covers the basics of the csv module, and further usage of the concepts and methods explained should be a walk in the park. For some heavier data-handling tasks, using the csv module is not advisable; the pandas library should come in handy in such situations, as it contains functions and objects that are better suited for such tasks.
It should also be clear by now why the usual read() and write() methods would not be feasible when handling CSV files.
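To illustrate that last point, here is a minimal sketch comparing plain string splitting, which is what a read()-based approach tends to end up doing, with csv.reader on a line that contains a quoted comma. The line itself is made up for the example.

```python
import csv
import io

# one made-up line whose second field contains a comma inside quotes
raw_line = 'Mary,"Smith, Robinson",mary@example.com'

# naive approach: plain string splitting breaks the quoted field apart
naive_fields = raw_line.split(',')

# csv.reader understands the quoting and keeps the field whole
parsed_fields = next(csv.reader(io.StringIO(raw_line)))

print(naive_fields)   # ['Mary', '"Smith', ' Robinson"', 'mary@example.com']
print(parsed_fields)  # ['Mary', 'Smith, Robinson', 'mary@example.com']
```

The split version yields four pieces with stray quote characters, while the csv module returns the three intended fields.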