Parsing a pipe-delimited file in Python

I'm trying to parse a pipe-delimited file and load the values into a list, so that later I can print selected values from the list. The file looks like:

name|age|address|phone|||||||||||..etc 
It has more than 100 columns.

asked Apr 11, 2013 at 18:29

A good question will have a sample code and any errors you get when trying to run the code. Commented Apr 11, 2013 at 18:38

@jwodder: Whatever the reason, it seems to have worked: this question got two valid answers, while the other one got none and was auto-deleted. Voting to reopen, despite the awful score.

Commented Sep 15, 2014 at 17:04

I am so pleased that the attempt to close this question failed on the second attempt! Commented Dec 7, 2020 at 11:06

4 Answers

First, register your dialect:

    import csv

    csv.register_dialect('piper', delimiter='|', quoting=csv.QUOTE_NONE)

Then, use your dialect on the file:

    # Written for Python 2. On Python 3, open the file in text mode
    # (e.g. open(myfile, newline='')) and call print(row['name']).
    with open(myfile, "rb") as csvfile:
        for row in csv.DictReader(csvfile, dialect='piper'):
            print row['name']

answered Apr 11, 2013 at 18:44 by Spencer Rathbun

I am thankful for the proposed solution, but I ran into some minor problems on Python 3. The error "iterator should return strings, not bytes (did you open the file in text mode?)" went away once I opened the file with mode='r' instead of mode='rb' as given in the solution, and added encoding='utf-8' to the open() call.

Commented Dec 7, 2020 at 11:28
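For readers on Python 3, the answer and the comment above can be combined roughly as follows. This is only a sketch: the sample data and the helper name `read_names` are made up for illustration, and an in-memory `io.StringIO` stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Same dialect as in the answer: pipe-delimited, no quote handling.
csv.register_dialect('piper', delimiter='|', quoting=csv.QUOTE_NONE)


def read_names(csvfile):
    """Return the 'name' column from an open pipe-delimited text file."""
    return [row['name'] for row in csv.DictReader(csvfile, dialect='piper')]


# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "name|age|address|phone\n"
    "Alice|30|1 Main St|555-0100\n"
    "Bob|25|2 Oak Ave|555-0101\n"
)
names = read_names(sample)
print(names)

# With a real file, open it in text mode; the csv docs recommend newline='':
# with open(myfile, newline='', encoding='utf-8') as csvfile:
#     print(read_names(csvfile))
```

`csv.DictReader` takes the column names from the first line, so each row can be indexed by header name rather than by position, which is convenient with 100+ columns.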
    import pandas as pd

    df = pd.read_csv(filename, sep="|")

This will store the file in a dataframe. For each column, you can apply conditions to select the values you want to print. It runs quickly; I tried it on a file with 111,047 rows.

answered Feb 1, 2017 at 14:57

Perhaps extend the sample code to extract individual values? Commented Mar 7, 2022 at 18:52
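Along the lines of that comment, here is one way individual values might be pulled out of the dataframe. It is a sketch only: the sample rows are invented, the column names come from the question's header, and `io.StringIO` is used in place of a file path so the snippet is self-contained.

```python
import io

import pandas as pd

# Hypothetical sample data; with a real file, pass its path instead.
sample = io.StringIO(
    "name|age|address|phone\n"
    "Alice|30|1 Main St|555-0100\n"
    "Bob|25|2 Oak Ave|555-0101\n"
)
df = pd.read_csv(sample, sep="|")

# Select a whole column by its header name.
names = df["name"].tolist()

# Apply a condition on one column and pick values from another.
over_26 = df.loc[df["age"] > 26, "name"].tolist()

# Read a single cell: first row, 'phone' column.
first_phone = df.loc[0, "phone"]

print(names, over_26, first_phone)
```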

If you're parsing a very simple file that won't contain any | characters in the actual field values, you can use split:

    with open('file', 'r') as file_handle:
        for line in file_handle:
            # Strip the trailing newline so it does not end up in the last field
            fields = line.rstrip('\n').split('|')
            print(fields[0])  # prints the first field's value
            print(fields[1])  # prints the second field's value

A more robust way to parse tabular data would be to use the csv library as mentioned in Spencer Rathbun's answer.
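To illustrate the difference, the sketch below parses a made-up line whose address field contains a pipe and is wrapped in double quotes. It assumes the file quotes such fields, and it uses `csv.reader` with its default quoting rather than the `QUOTE_NONE` dialect from the other answer, which would not treat the quotes specially.

```python
import csv
import io

# Hypothetical data: the address field contains a pipe, so it is quoted.
data = 'name|age|address\nAlice|30|"Unit 4|1 Main St"\n'

# Naive split: the quoted field is cut in two.
split_rows = [line.split('|') for line in data.splitlines()]

# csv.reader with default quoting keeps the quoted field together.
csv_rows = list(csv.reader(io.StringIO(data), delimiter='|'))

print(split_rows[1])
print(csv_rows[1])
```

Either way the result is a list of lists, so a selected value can be printed by position, for example `csv_rows[1][0]` for the first field of the first data row.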