Said yes to a request to help someone with some data analysis, for free.
Turned out the data was 17K files, 77GB in total, fairly 'dirty', and spread across several delimited formats: tab-separated, comma-separated, etc.
Bloody mistake on my part. From now on, if I don't know how reliable a dataset is, I'm not going to help someone out without looking into it first. I've burnt nearly 20 hours on it already over the last 5-6 weeks.
I didn't really have time to deal with this amount of data, and the requestor has deadlines.
Advice for the reader: don't volunteer without understanding the task ahead of you.
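For what it's worth, here is a rough sketch of the kind of quick look that might have warned me off before I said yes: count the files, add up their size, and guess each file's delimiter. It is not what I actually ran. The names `survey` and `guess_delimiter`, the candidate delimiter list, and the 64KB sample size are all my own assumptions, and `csv.Sniffer` is only a heuristic, so treat the tallies as a hint rather than a verdict.

```python
import csv
import os
from collections import Counter

# Assumption: the delimiters worth checking for. Adjust to taste.
CANDIDATE_DELIMITERS = ",\t;|"


def guess_delimiter(path, sample_bytes=64 * 1024):
    """Guess a file's delimiter from its first chunk (heuristic only)."""
    with open(path, "r", encoding="utf-8", errors="replace", newline="") as fh:
        sample = fh.read(sample_bytes)
    if not sample.strip():
        return "empty"
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=CANDIDATE_DELIMITERS)
        return dialect.delimiter
    except csv.Error:
        # Sniffer could not settle on a delimiter; a sign the file is messy.
        return "unknown"


def survey(root):
    """Walk `root` and return (file count, total bytes, delimiter tally)."""
    file_count = 0
    total_bytes = 0
    delimiters = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            file_count += 1
            total_bytes += os.path.getsize(path)
            delimiters[guess_delimiter(path)] += 1
    return file_count, total_bytes, delimiters
```

Calling `survey("path/to/data")` (a placeholder path) and seeing thousands of files with a mixed or mostly "unknown" delimiter tally is a decent cue to ask more questions before committing any time.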