Monday 24 September 2018

Data accumulation and cleaning are to data analysis what foundation and reinforcement are to building a home

The Yahoo Finance API used to do everything for us. However, it has been closed off, so people like me have to find another way to get access to financial data before doing any analysis.


I did some work to accumulate data myself to have a data source for analysis:
https://github.com/chenlocus/aushare

However, I found that this is far from enough. The ASX has a lot of codes that were delisted but are still on its website for people to download:
https://www.asx.com.au/asx/research/ASXListedCompanies.csv


First of all, I need to sort out the delisted symbols from the list above. Then I need to remove invalid values such as 'null', 'n/a' and '-' from the data I crawled. Thanks to pandas, which does a lot of the work and makes the job much easier.
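
Here is a rough sketch of what that cleaning step could look like in pandas. It is only an illustration, not the code in my repository: the function name `clean_crawled`, the column names `code`, `pe` and `dividend`, and the sample tickers are all made up for the example, and it assumes that every column other than the symbol column should be numeric.

```python
import pandas as pd

# Placeholder strings mentioned above; extend the list as needed.
PLACEHOLDERS = ["null", "n/a", "-"]


def clean_crawled(df, delisted_codes, code_col="code"):
    """Drop delisted symbols and turn placeholder strings into NaN.

    Hypothetical helper: the column layout is an assumption.
    """
    # Keep only the rows whose symbol is not in the delisted set.
    kept = df[~df[code_col].isin(delisted_codes)].copy()
    # Blank out cells that are exactly a placeholder string.
    # A real value such as "-1.2" is not an exact match, so it is untouched.
    kept = kept.mask(kept.isin(PLACEHOLDERS))
    # Convert the remaining columns to numbers; anything unparseable becomes NaN.
    for col in kept.columns:
        if col != code_col:
            kept[col] = pd.to_numeric(kept[col], errors="coerce")
    return kept.reset_index(drop=True)


# Made-up sample data standing in for crawled rows.
raw = pd.DataFrame({
    "code": ["AAA", "BBB", "CCC"],
    "pe": ["12.5", "n/a", "8.1"],
    "dividend": ["-", "0.30", "null"],
})
cleaned = clean_crawled(raw, delisted_codes={"CCC"})
print(cleaned)
```

Turning the placeholders into NaN instead of dropping the rows keeps a company in the data set even when only one of its fields is missing, and pandas skips NaN by default in calculations such as `mean()`.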


I need to amass balance sheets, income statements and cash flow statements, as well as information such as market capitalisation, P/E and dividends, from websites and store them in various files. These data then have to be cleaned so that they provide a strong foundation and footing for my data analysis work.


I feel so excited that I can use these data analysis tools and techniques to replay and prove my opinions and philosophies on stock market investment, going back to 1997 when I was a university student.


These are things I figured out when the Yahoo Finance API was available; now I have to do this by building an API myself. I think a real investor understands the importance of the price you buy in at. Charts like these can provide a benchmark for evaluating the share price in different periods.





