Categories: Analytics, Case Studies, Implementation | Sun, 21 Jun 2020 14:30:00 GMT
Blogs > https://www.sociallyconstructed.online/blogs/week-4-putting-code-to-pixel
Series Blog Contents:
Check here for all published blogs.
Announcement: Ria’s Journey begins!
Week 1: Ria’s 1st week
Week 2: The SCMS in Airtable
Week 3: Preparations & Superheroes
Week 4: Putting Code to Pixel
Week 5: The SCMS Data’s Alive!
Week 6: Airtable to Google Sheets
Week 7: Our 1st Visualizations
Putting Code to Pixel: My First Week Coding
So this past Monday I put on my enthusiastic boots and set off to work.
I first worked out which data attributes needed to be extracted to keep the output data consistent. Then I wrote a new enricher to extract GitHub data for the Social Currency Metrics System, collecting the comments made under GitHub issues and pull requests.
We decided to keep the context for each GitHub comment, to bring more clarity to the table and reduce redundant ideas. So I retrieved the ‘Title’ of the pull request or issue as the ‘context’, since the title of an issue/PR conveys the same message as the comments beneath it. A similar step applies when extracting emails from the mailing lists: there, the ‘Subject’ of the email is set as the ‘context’.
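The context-extraction step above can be sketched as a small function. The field and source names here are illustrative, not the actual SCMS enricher code:

```python
# Sketch: attach a 'context' field to an enriched item, as described above.
# Field names ("issue_title", "subject", etc.) are hypothetical.

def add_context(item):
    """Return a copy of `item` with a 'context' field.

    GitHub comments take the title of their parent issue/PR as context;
    mailing-list messages take the email Subject.
    """
    if item["source"] == "github":
        # The parent issue/PR title conveys what the comment is about.
        context = item["issue_title"]
    elif item["source"] == "mbox":
        context = item["subject"]
    else:
        context = ""
    return {**item, "context": context}


comment = {"source": "github", "issue_title": "Fix CSV export", "body": "LGTM"}
mail = {"source": "mbox", "subject": "Weekly sync notes", "body": "Agenda..."}
print(add_context(comment)["context"])  # Fix CSV export
print(add_context(mail)["context"])     # Weekly sync notes
```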
Next, after performing an ElasticDump operation, I wrote a script called “ES2Excel” which converts data from Elasticsearch indexes into a CSV file, then into an Excel format, and finally into an Airtable view.
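The CSV stage of an ES2Excel-style conversion might look roughly like this — flattening Elasticsearch-style hits into CSV rows. This is a minimal sketch with hypothetical field names; the real script also produces the Excel file and the Airtable view:

```python
import csv
import io


def hits_to_csv(hits, fields):
    """Write the `_source` of each Elasticsearch hit as one CSV row."""
    buf = io.StringIO()
    # extrasaction="ignore" skips any _source fields we don't want in the sheet.
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for hit in hits:
        writer.writerow(hit["_source"])
    return buf.getvalue()


hits = [
    {"_source": {"author": "ria", "context": "Fix CSV export", "body": "LGTM"}},
    {"_source": {"author": "dylan", "context": "Weekly sync", "body": "+1"}},
]
print(hits_to_csv(hits, ["author", "context", "body"]))
```

In practice the hits would come from scrolling over the enriched index rather than a hard-coded list.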
The data obtained from emails (MBox) and comments (GitHub) then needed to be collected together in one Excel sheet. For this, we perform “aliasing” on the ES indexes: the two enriched indexes are aliased as a third index, which can then be used for creating the CSV and Excel files.
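The aliasing step points one alias at both enriched indexes, so a single query covers GitHub comments and mailing-list messages. Here is a sketch that builds the request body for Elasticsearch's `/_aliases` API; the index and alias names are illustrative:

```python
import json


def alias_actions(alias, indexes):
    """Build the /_aliases request body that adds `alias` to each index."""
    return {
        "actions": [{"add": {"index": ix, "alias": alias}} for ix in indexes]
    }


payload = alias_actions("scms_enriched", ["github_enriched", "mbox_enriched"])
print(json.dumps(payload, indent=2))
# The payload would then be POSTed to the cluster's /_aliases endpoint,
# e.g. requests.post(es_url + "/_aliases", json=payload).
```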
We execute the ES2Excel script mentioned above on the aliased index, and the resulting CSV sheet is converted into an Airtable view using the Airtable API. This Airtable view is then ready for all the tagging procedures to be performed on it. The output can be seen below.
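A hedged sketch of the upload step: Airtable's REST API accepts records in small batches (up to 10 per request at the time of writing), so the CSV rows are wrapped as records and chunked first. The base/table names and the `requests.post` call in the comment are illustrative:

```python
def to_airtable_batches(rows, batch_size=10):
    """Wrap each row dict as an Airtable record and chunk into batches."""
    records = [{"fields": row} for row in rows]
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


rows = [{"author": f"user{i}", "context": "Fix CSV export"} for i in range(12)]
batches = to_airtable_batches(rows)
print(len(batches), len(batches[0]), len(batches[1]))  # 2 10 2
# Each batch would then be sent as:
# requests.post("https://api.airtable.com/v0/<base_id>/<table>",
#               headers={"Authorization": "Bearer <api_key>"},
#               json={"records": batch})
```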
Where we are and where we’re going
It was fun to relate those theories to the Social Currency Metrics System. Dylan and Venia also emphasized understanding the process of community interactions and bringing it into a codable form. My mentors suggested that we should have a third data source besides GitHub and the mailing lists, to avoid biased data, so next week we’ll also be looking at adding Twitter or IRC data.
For the most part, next week I’ll be focusing on converting the randomly tagged text data to Elasticsearch with the help of a ‘Study’. The work done this week was nicely aligned with the timeline. Looking forward to more learning sessions!