Re: Chronicle Of A Data Scientist/analyst by Dthinkerman: 7:15am On Jun 05, 2020 |
hammerpP:
lala, please help us ask the tribalistic moderator in Politics section wat this thread did to him.
He is the same mod that will turn a blind eye to thread abusing Igbo people, yet very quick to hide a thread discussing oil in Igboland.
Help us ask him wat this thread did to him. cc seun https://www.nairaland.com/5903415/no-oil-south-east#90309951
Please stop using the fact that Lala replied to our messages here and pushed one of our own works to the Front Page as licence to keep quoting him and asking him to do other things. If he feels something is good enough, he will do it, as pointed out earlier. I don't mean to be harsh, but this may come across as abusing the opportunity, and he may not even heed it anyway. |
Re: Chronicle Of A Data Scientist/analyst by ibromodzi: 7:18am On Jun 05, 2020 |
Graspad:
Do you mean from scratch??
Not necessarily. |
Re: Chronicle Of A Data Scientist/analyst by Dthinkerman: 7:24am On Jun 05, 2020 |
Georgry: Please someone should advise me, I want to learn programming from scratch and I've ordered for a laptop but I don't know if this laptop can do the job as it looks like a low end system.
Dell latitude e5500 Hdd 320gb Ram 4gb Graphics card 518mb Screen size is 15 Intel core 2 duo No webcam... Please its urgent because I'm getting the system this morning...
Cc. Graspad Ejiod Elunico
I highly doubt this particular laptop has a standalone graphics card. If it has any graphics memory as claimed, it would be video RAM partitioned from the main RAM, as is usual in Dell's designs. Equally, it doesn't take a Google search to tell that it's old, and if you are spending a lot to get it, I recommend you get a more recent one that is stronger in RAM, storage, and clock rate, rather than in the supposed "video graphics". 2 Likes |
Re: Chronicle Of A Data Scientist/analyst by Ejiod(m): 7:35am On Jun 05, 2020 |
Georgry: Please someone should advise me, I want to learn programming from scratch and I've ordered for a laptop but I don't know if this laptop can do the job as it looks like a low end system.
Dell latitude e5500 Hdd 320gb Ram 4gb Graphics card 518mb Screen size is 15 Intel core 2 duo No webcam... Please its urgent because I'm getting the system this morning...
Cc. Graspad Ejiod Elunico That’s why we have Google Colab. Just google Google Colab and start your data science there. 2 Likes 1 Share |
Re: Chronicle Of A Data Scientist/analyst by Hardheolar(m): 8:41am On Jun 05, 2020 |
hammerpP:
lala, please help us ask the tribalistic moderator in Politics section wat this thread did to him.
He is the same mod that will turn a blind eye to thread abusing Igbo people, yet very quick to hide a thread discussing oil in Igboland.
Help us ask him wat this thread did to him. cc seun https://www.nairaland.com/5903415/no-oil-south-east#90309951
Not to sound condescending, but what you just did is appalling and needs to be condemned by all. |
Re: Chronicle Of A Data Scientist/analyst by Georgry(m): 8:48am On Jun 05, 2020 |
Dthinkerman:
I highly doubt this particular laptop has a standalone graphics card. If it has any graphics memory as claimed, it would be video RAM partitioned from the main RAM, as is usual in Dell's designs.
Equally, it doesn't take a Google search to tell that it's old, and if you are spending a lot to get it, I recommend you get a more recent one that is stronger in RAM, storage, and clock rate, rather than in the supposed "video graphics".
Thanks so much bro. Do you think 40 thousand naira will be too much for this kind of computer? And can I get any recent UK-used laptop in that kind of price range? |
Re: Chronicle Of A Data Scientist/analyst by cochtrane(m): 10:08am On Jun 05, 2020 |
elunico:
Great!! I'm also an engineer. I haven't much knowledge of programming; in fact, I just started learning HTML. How has your limited knowledge of statistics played a role in your progress?
I'm good with calculations though.
I actually love the statistics. I'm building my knowledge of inferential statistics. It's beautiful. Generally, it's not been very difficult for me to understand most of the topics. The difficult things are the maths associated with PCA, Gaussian processes (and related...). I'm familiar with linear algebra, however, so I believe that in a while I'll be good with these topics. 1 Like |
Re: Chronicle Of A Data Scientist/analyst by Hardheolar(m): 1:07pm On Jun 05, 2020 |
cochtrane: As a budding data scientist who visits NL often, it's not surprising that you start to get more than interested in some of the topics making front page and how frequently topics from individual sections reach the top. I have been looking into this for a while and thought it would be nice to do some investigation in this regard. For example, which section makes front page most often? How often do we see programming topics get to the front page? Who posts more often on the front page? Is it really lalasticlala, as is frequently supposed, or is it someone else? What exactly has been the relationship between lalasticlala and snakes over the past year? Some people think he loves to push snake topics to the front page more often than other topics. What else can we learn from the topics making front page? For example, are they mostly about Buhari or something else?
Anyways, getting your hands dirty with a data set is always a good way to learn data analysis. If you need help with navigating this, you can buzz me.
The data in the link you pasted shows 3rd of June 2019 to 2nd of June 2020, which differs from your stated starting date. |
Re: Chronicle Of A Data Scientist/analyst by cochtrane(m): 1:09pm On Jun 05, 2020 |
Hardheolar:
The data in the link you pasted shows 3rd of June 2019 to 2nd of June 2020, which differs from your stated starting date.
Just looked at the csv now. This is where it stops. That's 31st of May:
2019-05-31 15:38:00 1 Like |
Re: Chronicle Of A Data Scientist/analyst by elunico: 2:54pm On Jun 05, 2020 |
cochtrane:
I actually love the statistics. I'm building my knowledge of inferential statistics. It's beautiful. Generally, it's not been very difficult for me to understand most of the topics. The difficult things are the maths associated with PCA, Gaussian processes (and related...). I'm familiar with linear algebra, however, so I believe that in a while I'll be good with these topics.
Yeah, with your engineering background, the associated math shouldn't be much of a hurdle.
Please, I'd like to know the areas of statistics I need to arm myself with to become proficient in the data science field. Materials you have found helpful too. |
Re: Chronicle Of A Data Scientist/analyst by cochtrane(m): 3:44pm On Jun 05, 2020 |
elunico:
Yeah, with your engineering background, the associated math shouldn't be much of a hurdle.
Please, I'd like to know the areas of statistics I need to arm myself with to become proficient in the data science field. Materials you have found helpful too.
Probability distributions, inferential statistics and, of course, descriptive statistics. I feel like descriptive statistics is what you wanna focus on more, cos it's foundational. And naturally, Bayesian statistics. Check out R for applied statistics, if you use R. Or the one here: faculty.marshall.usc.edu/gareth-james/ISL/ 6 Likes |
Re: Chronicle Of A Data Scientist/analyst by Hardheolar(m): 5:06pm On Jun 05, 2020 |
cochtrane:
Just looked at the csv now. This is where it stops. That's 31st of May:
2019-05-31 15:38:00
Was actually looking at the date it hit front page. Btw, the link is no longer accessible. |
Re: Chronicle Of A Data Scientist/analyst by cochtrane(m): 5:22pm On Jun 05, 2020 |
Hardheolar:
Was actually looking at the date it hit front page. Btw, the link is no longer accessible.
Did some reorganization. Here it is. I'll be uploading some more scraped data, which includes details of each thread, including user, sex and number of posts. |
Re: Chronicle Of A Data Scientist/analyst by Hardheolar(m): 6:03pm On Jun 05, 2020 |
Hi guys, I did further digging on the dataset that cochtrane scraped, using a business intelligence tool called Power BI. Note: the analysis is based on the date each thread made it to the front page (3rd of June 2019 to 2nd of June 2020), not the date it was created. The data has 28,516 threads from 38 different sections by 5,025 different accounts.
A few insights from the dataset:
- dre11 is the king of political threads with 361 threads, followed by Islie (329) and ijustdey (225), during the period captured. Lala is more interested in celebrity threads than in politics, followed by Alex. Ogbiwa is the defending champion of sports threads.
- Threads are mostly pushed to the front page in the morning, which is reasonable since that is when the day begins.
- There was a spike in threads pushed to the front page in July 2019, but I can't tell if that is the norm for that period since we don't have the previous year's data for comparison.
- You would think that threads in the religion section should top the threads that make it to the front page on Sundays, but it came third, after politics and celebrity threads.
- 853 threads made it to the front page with the "kill" keyword. That is worrisome and a cause for concern.
Etc. Lots of insight can be derived from the data. Below is a link to the report. It is interactive, so feel free to play around with it. https:///3gXsRlq 3 Likes |
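For anyone who wants to try the same kind of counting without Power BI, here is a minimal sketch in plain Python. It is only an illustration, not how the report above was built. The column names (section, author, title, frontpage_time) and the sample rows are assumptions, since the actual layout of the scraped CSV isn't shown in the thread.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; the real CSV's columns and values may differ.
SAMPLE_CSV = """section,author,title,frontpage_time
Politics,dre11,Senate Passes Budget,2019-06-03 08:15:00
Politics,Islie,Governor Visits Market,2019-06-03 09:40:00
Politics,dre11,Gunmen Kill Two In Village,2019-06-04 07:05:00
Sports,Ogbiwa,Super Eagles Win Friendly,2019-06-04 18:30:00
Celebrities,lalasticlala,Actor Buys New Car,2019-06-09 10:20:00
"""


def load_rows(text):
    """Parse CSV text into dicts and convert the timestamp column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["frontpage_time"] = datetime.strptime(
            row["frontpage_time"], "%Y-%m-%d %H:%M:%S"
        )
    return rows


def top_authors(rows, section, n=3):
    """Most frequent front-page authors within one section."""
    counts = Counter(r["author"] for r in rows if r["section"] == section)
    return counts.most_common(n)


def threads_by_hour(rows):
    """Number of front-page threads per hour of day (0-23)."""
    return Counter(r["frontpage_time"].hour for r in rows)


def sections_on_weekday(rows, weekday):
    """Section counts for one weekday (Monday is 0, Sunday is 6)."""
    return Counter(
        r["section"] for r in rows if r["frontpage_time"].weekday() == weekday
    )


def keyword_count(rows, keyword):
    """Number of titles containing the keyword, ignoring case."""
    return sum(keyword.lower() in r["title"].lower() for r in rows)


rows = load_rows(SAMPLE_CSV)
print(top_authors(rows, "Politics"))
print(threads_by_hour(rows))
print(sections_on_weekday(rows, 6))
print(keyword_count(rows, "kill"))
```

Note that the keyword check is a plain substring match, so a title containing "skill" or "killer" would also count. Whether the 853 figure above was computed that way isn't stated, so treat any comparison with care.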
Re: Chronicle Of A Data Scientist/analyst by mbhs139(m): 7:03pm On Jun 05, 2020 |
I think this is another good site for data: Nigeria's Federal Government daily spending report. Source is Twitter. https://opentreasury.gov.ng/index.php/component/content/article/11-dpr/3015-2020-daily-payment?Itemid=101
I tried to send the books I promised, especially to Amoo Segun (or is it Samuel), who dropped his email, but the mail keeps bouncing back. It's not something difficult, though; I'm just sort of distracted. I will do it, by God's grace, once I clear my head. In the meantime, check the above link for real-time financial data. 8 Likes 1 Share |
Re: Chronicle Of A Data Scientist/analyst by mbhs139(m): 7:04pm On Jun 05, 2020 |
Hardheolar: Hi guys, I did further digging on the dataset that cochtrane scraped, using a business intelligence tool called Power BI. Note: the analysis is based on the date each thread made it to the front page (3rd of June 2019 to 2nd of June 2020), not the date it was created. The data has 28,516 threads from 38 different sections by 5,025 different accounts.
A few insights from the dataset:
- dre11 is the king of political threads with 361 threads, followed by Islie (329) and ijustdey (225), during the period captured. Lala is more interested in celebrity threads than in politics, followed by Alex. Ogbiwa is the defending champion of sports threads.
- Threads are mostly pushed to the front page in the morning, which is reasonable since that is when the day begins.
- There was a spike in threads pushed to the front page in July 2019, but I can't tell if that is the norm for that period since we don't have the previous year's data for comparison.
- You would think that threads in the religion section should top the threads that make it to the front page on Sundays, but it came third, after politics and celebrity threads.
- 853 threads made it to the front page with the "kill" keyword. That is worrisome and a cause for concern.
Etc. Lots of insight can be derived from the data. Below is a link to the report. It is interactive, so feel free to play around with it.
https:///3gXsRlq
This is a fantastic job. I love this. Kudos! 2 Likes |
Re: Chronicle Of A Data Scientist/analyst by iCode2: 7:28pm On Jun 05, 2020 |
Hardheolar: Hi guys, I did further digging on the dataset that cochtrane scraped, using a business intelligence tool called Power BI. Note: the analysis is based on the date each thread made it to the front page (3rd of June 2019 to 2nd of June 2020), not the date it was created. The data has 28,516 threads from 38 different sections by 5,025 different accounts.
A few insights from the dataset:
- dre11 is the king of political threads with 361 threads, followed by Islie (329) and ijustdey (225), during the period captured. Lala is more interested in celebrity threads than in politics, followed by Alex. Ogbiwa is the defending champion of sports threads.
- Threads are mostly pushed to the front page in the morning, which is reasonable since that is when the day begins.
- There was a spike in threads pushed to the front page in July 2019, but I can't tell if that is the norm for that period since we don't have the previous year's data for comparison.
- You would think that threads in the religion section should top the threads that make it to the front page on Sundays, but it came third, after politics and celebrity threads.
- 853 threads made it to the front page with the "kill" keyword. That is worrisome and a cause for concern.
Etc. Lots of insight can be derived from the data. Below is a link to the report. It is interactive, so feel free to play around with it.
https:///3gXsRlq
I wish I could do this already! 3 Likes |
Re: Chronicle Of A Data Scientist/analyst by brashear: 7:33pm On Jun 05, 2020 |
Hardheolar: Hi guys, I did further digging on the dataset that cochtrane scraped, using a business intelligence tool called Power BI. Note: the analysis is based on the date each thread made it to the front page (3rd of June 2019 to 2nd of June 2020), not the date it was created. The data has 28,516 threads from 38 different sections by 5,025 different accounts.
A few insights from the dataset:
- dre11 is the king of political threads with 361 threads, followed by Islie (329) and ijustdey (225), during the period captured. Lala is more interested in celebrity threads than in politics, followed by Alex. Ogbiwa is the defending champion of sports threads.
- Threads are mostly pushed to the front page in the morning, which is reasonable since that is when the day begins.
- There was a spike in threads pushed to the front page in July 2019, but I can't tell if that is the norm for that period since we don't have the previous year's data for comparison.
- You would think that threads in the religion section should top the threads that make it to the front page on Sundays, but it came third, after politics and celebrity threads.
- 853 threads made it to the front page with the "kill" keyword. That is worrisome and a cause for concern.
Etc. Lots of insight can be derived from the data. Below is a link to the report. It is interactive, so feel free to play around with it.
https:///3gXsRlq
Impressive! 2 Likes |
Re: Chronicle Of A Data Scientist/analyst by mbhs139(m): 7:50pm On Jun 05, 2020 |
iCode2: I wish I could do this already!
It's a matter of time. Keep on trying, keep on pushing, you'll get there. 1 Like |
Re: Chronicle Of A Data Scientist/analyst by Dthinkerman: 8:52pm On Jun 05, 2020 |
Georgry:
Thanks so much bro. Do you think 40 thousand naira will be too much for this kind of computer? And can I get any recent UK-used laptop in that kind of price range?
Sorry for the late response. I think you should look around and compare lots of options before settling on any. The budget is actually low, given the hike in prices in these trying times. 1 Like |
Re: Chronicle Of A Data Scientist/analyst by Dthinkerman: 8:54pm On Jun 05, 2020 |
mbhs139:
It's a matter of time. Keep on trying, keep on pushing, you'll get there.
Hello, I quoted you in the past, requesting a tutorial video. Could you kindly check and assist me with it? |
Re: Chronicle Of A Data Scientist/analyst by Hardheolar(m): 9:51pm On Jun 05, 2020 |
iCode2: I wish I could do this already!
Persistence is key. Just keep pushing. I also feel that way when I see some reports. |
Re: Chronicle Of A Data Scientist/analyst by iCode2: 11:44pm On Jun 05, 2020 |
mbhs139:
It's a matter of time. Keep on trying, keep on pushing, you'll get there.
Hardheolar:
Persistence is key. Just keep pushing. I also feel that way when I see some reports.
Thanks guys. 1 Like |
Re: Chronicle Of A Data Scientist/analyst by cochtrane(m): 12:47am On Jun 06, 2020 |
Hardheolar: Hi guys, I did further digging on the dataset that cochtrane scraped, using a business intelligence tool called Power BI. Note: the analysis is based on the date each thread made it to the front page (3rd of June 2019 to 2nd of June 2020), not the date it was created. The data has 28,516 threads from 38 different sections by 5,025 different accounts.
A few insights from the dataset:
- dre11 is the king of political threads with 361 threads, followed by Islie (329) and ijustdey (225), during the period captured. Lala is more interested in celebrity threads than in politics, followed by Alex. Ogbiwa is the defending champion of sports threads.
- Threads are mostly pushed to the front page in the morning, which is reasonable since that is when the day begins.
- There was a spike in threads pushed to the front page in July 2019, but I can't tell if that is the norm for that period since we don't have the previous year's data for comparison.
- You would think that threads in the religion section should top the threads that make it to the front page on Sundays, but it came third, after politics and celebrity threads.
- 853 threads made it to the front page with the "kill" keyword. That is worrisome and a cause for concern.
Etc. Lots of insight can be derived from the data.
Brilliant! That monthly analysis thing is quite revealing. I also wonder if there's any insight derivable from the influence of coronavirus on hours/days of front-page posts. |
Re: Chronicle Of A Data Scientist/analyst by elunico: 6:02am On Jun 06, 2020 |
cochtrane:
Probability distributions, inferential statistics and, of course, descriptive statistics. I feel like descriptive statistics is what you wanna focus on more, cos it's foundational. And naturally, Bayesian statistics. Check out R for applied statistics, if you use R. Or the one here: faculty.marshall.usc.edu/gareth-james/ISL/
Thanks for this.
I'll search the internet for materials on descriptive statistics and Bayesian stats as you've suggested, and get to them when my study requires the knowledge.
I think Imma stick with Python, not R. |
Re: Chronicle Of A Data Scientist/analyst by Dum20: 7:31am On Jun 06, 2020 |
Hardheolar: Hi guys, I did further digging on the dataset that cochtrane scraped, using a business intelligence tool called Power BI. Note: the analysis is based on the date each thread made it to the front page (3rd of June 2019 to 2nd of June 2020), not the date it was created. The data has 28,516 threads from 38 different sections by 5,025 different accounts.
A few insights from the dataset:
- dre11 is the king of political threads with 361 threads, followed by Islie (329) and ijustdey (225), during the period captured. Lala is more interested in celebrity threads than in politics, followed by Alex. Ogbiwa is the defending champion of sports threads.
- Threads are mostly pushed to the front page in the morning, which is reasonable since that is when the day begins.
- There was a spike in threads pushed to the front page in July 2019, but I can't tell if that is the norm for that period since we don't have the previous year's data for comparison.
- You would think that threads in the religion section should top the threads that make it to the front page on Sundays, but it came third, after politics and celebrity threads.
- 853 threads made it to the front page with the "kill" keyword. That is worrisome and a cause for concern.
Etc. Lots of insight can be derived from the data. Below is a link to the report. It is interactive, so feel free to play around with it.
https:///3gXsRlq
Wow, this is fantastic. Very smart analysis. I am impressed. So bro, for those of us coming behind, can you let us know:
1. How long have you been learning programming and data science?
2. What courses did you take to bring you to this level?
3. Any other words of encouragement? 1 Like |
Re: Chronicle Of A Data Scientist/analyst by deedat205(m): 7:43am On Jun 06, 2020 |
Hardheolar: Hi guys, I did further digging on the dataset that cochtrane scraped, using a business intelligence tool called Power BI. Note: the analysis is based on the date each thread made it to the front page (3rd of June 2019 to 2nd of June 2020), not the date it was created. The data has 28,516 threads from 38 different sections by 5,025 different accounts............... https:///3gXsRlq
Impressive 1 Like |
Re: Chronicle Of A Data Scientist/analyst by Hardheolar(m): 9:07am On Jun 06, 2020 |
cochtrane:
Brilliant! That monthly analysis thing is quite revealing. I also wonder if there's any insight derivable from the influence of coronavirus on hours/days of front-page posts.
The advent of coronavirus in March increased the number of posts hitting the front page from the health section, as seen in the image attached below. 1,617 threads on coronavirus have made it to the front page. Instead of having the months run from January to December, which might be confusing, I rearranged them to start from when the data starts, which is June 2019. There has been a huge decline in the number of posts making it to the front page since December 2019, though the number rose slightly, by 17%, in March 2020 due to Covid. The questions that need to be answered are:
- Was there a change in NL policy regarding the number of posts that hit the front page?
- Or was it the change in moderators during that period? 3 Likes |
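The month ordering and the percentage change mentioned above can be sketched in a few lines of Python. This is an assumed reconstruction, not the Power BI logic used for the chart: keying each count by (year, month) makes the months sort from the start of the data instead of from January, and the timestamps below are made up purely for illustration.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(timestamps):
    """Count items per (year, month), sorted chronologically."""
    counts = Counter((t.year, t.month) for t in timestamps)
    return sorted(counts.items())


def pct_change(counts):
    """Percentage change between consecutive entries of monthly_counts().

    Months with no items at all are absent from the input, so a gap in
    the data would be compared across the gap rather than against zero.
    """
    changes = []
    for (_, prev), (month, cur) in zip(counts, counts[1:]):
        changes.append((month, round(100.0 * (cur - prev) / prev, 1)))
    return changes


# Hypothetical front-page timestamps, not values from the real dataset.
stamps = (
    [datetime(2019, 12, d) for d in (1, 5, 9, 20)]
    + [datetime(2020, 1, d) for d in (3, 8)]
    + [datetime(2020, 2, d) for d in (2, 14)]
    + [datetime(2020, 3, d) for d in (1, 10, 25)]
)

counts = monthly_counts(stamps)
print(counts)
print(pct_change(counts))
```

With real data, the March 2020 entry of pct_change is the figure to compare against the 17% quoted above.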
Re: Chronicle Of A Data Scientist/analyst by Zabiboy: 10:02am On Jun 06, 2020 |
Dum20:
Wow, this is fantastic. Very smart analysis. I am impressed.
So bro, for those of us coming behind, can you let us know:
1. How long have you been learning programming and data science?
2. What courses did you take to bring you to this level?
3. Any other words of encouragement?
I know I'm not the one this question is directed to, but the truth is Power BI and Tableau are quite straightforward and not difficult to learn. Almost everything has been done for you already. The only kind of difficult part of Power BI is the DAX functions, whose counterpart in Tableau is calculated fields. Generally, you can take 2-3 weeks to learn either of them and, with constant practice, in 1 or 2 months you should be a pro. Like I said, in both of them most of the work has already been done for you by the developers. The only hindrance is that neither of them can be used to web-scrape data. GL
5 Likes |
Re: Chronicle Of A Data Scientist/analyst by cochtrane(m): 11:19am On Jun 06, 2020 |
Hardheolar:
The advent of coronavirus in March increased the number of posts hitting the front page from the health section, as seen in the image attached below. 1,617 threads on coronavirus have made it to the front page.
Instead of having the months run from January to December, which might be confusing, I rearranged them to start from when the data starts, which is June 2019. There has been a huge decline in the number of posts making it to the front page since December 2019, though the number rose slightly, by 17%, in March 2020 due to Covid.
The questions that need to be answered are: - Was there a change in NL policy regarding the number of posts that hit the front page? - Or was it the change in moderators during that period?
This is good. Not unexpectedly, we saw lots more posts from the health section. I'm guessing you are onto something there regarding a factor directly causing the decline in the number of posts hitting the front page from around November. I'd say that decline is statistically significant. That's what I thought originally, before going into R to check, and it's kinda true. The mean decline in posts between September and October wasn't statistically significant. But once we got to November, the decline from the previous month became statistically significant. Same result between December and the previous month. This is inferential statistics. The density plots show why this may be the case, since the peaks are aligned differently. But one shouldn't judge only visually; a two-sample t-test, like the ones below, is the proper check.

t-test between September and October values: the p-value is quite high, nowhere near below 0.05

> t.test(t.sep, t.oct)

    Welch Two Sample t-test

data:  t.sep and t.oct
t = 0.012042, df = 56.259, p-value = 0.9904
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -13.86717  14.03492
sample estimates:
mean of x mean of y
 100.6000  100.5161

The decline between October and November shows statistical significance (p-value < 0.05)

> t.test(t.nov, t.oct)

    Welch Two Sample t-test

data:  t.nov and t.oct
t = -2.7617, df = 56.929, p-value = 0.007727
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -32.57471  -5.19088
sample estimates:
mean of x mean of y
 81.63333 100.51613 |
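For those following along in Python rather than R, the t and df values that t.test prints for a Welch test can be reproduced from the standard formulas using only the standard library. This is a sketch under assumptions: welch_t is a hypothetical helper name, and the two lists are toy numbers, not the front-page counts used above.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    nx, ny = len(x), len(y)
    # Squared standard error of each sample mean (sample variance / n).
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Toy samples for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

t, df = welch_t(x, y)
print(round(t, 4), round(df, 4))
```

Turning t and df into a p-value needs the CDF of the t distribution, which the standard library does not provide. If SciPy is installed, scipy.stats.ttest_ind(x, y, equal_var=False) runs the same Welch test and returns the p-value directly.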
Re: Chronicle Of A Data Scientist/analyst by cochtrane(m): 11:20am On Jun 06, 2020 |
Maybe the mods have an answer as to why there was such a decline. There is likely a causal factor, as the statistics suggest the decline is unlikely to be due to chance alone. |