Trend of Mobility Changes due to Pandemic

Tableau Dashboard of Mobility Changes

How are Indonesians moving around differently due to the COVID-19 pandemic? How have their mobility trends changed for trips to workplaces, groceries and pharmacies, restaurants, parks, and transit stations, or for just staying at home?

Well, under the #covid19 pandemic, it’s time for us to work from home, study from home, and worship at home. That advice was announced by Pak Jokowi for Indonesians (especially in Jabodetabek) on March 15, 2020. Look at the ‘spaghetti’ graph below, with data sourced from the Google Community Mobility Report. Starting from March 15, 2020, the mobility trend for places of residence changed by around +15%, while the trend for workplaces changed by around -15%. (Note: mobility changes reflect two parameters: the number of visits and the length of stay at the places.)


Data Science Learning Resources

Lucky are those of you who live in today’s fast-paced internet era. Learning resources can be found everywhere, from books, videos, and podcasts to online classes. I have summarized several resources that will hopefully inspire your learning. The list, from my GitHub, is here.

These are what inspired me in creating the Data Science Foundation Online Class at IYKRA.


How Did Community Mobility Patterns Change before and during the COVID-19 Pandemic?

In a few days, on April 24, 2020 to be exact, I will present at an IYKRA webinar. The theme is Analysis of Community Mobility Changes during the COVID-19 Pandemic.

Based on PHEOC Kemkes data, as of April 13, 2020, Indonesia had recorded at least 4,557 confirmed COVID-19 cases, with 399 deaths and 380 recoveries. The numbers are expected to keep rising day by day. Some predict exponential growth, as in the United States, while others predict a sigmoid curve, as in China.


Becoming a More Likeable Person

In this article I will share tips on how to become a more likeable person. One thing is certain: likeable people are pleasant people. They are honest, not fake, and not thirsty for attention. These tips will be really useful for everyone, especially business people and those who work in data.

Usually, business people or employees know the distinction between superiors, subordinates, and peers (including clients). According to wise people, life is not only about money, but also about relationships. So… let’s take a look.


Sharing: How to Become a Junior Data Scientist?

A few days ago, coinciding with Mother’s Day (the Indonesian version) on December 22, 2018, I was entrusted by Mas Dian, the owner of babastudio.com, to be one of the speakers at the “Learning From The Master” event, held at its campus in the ITC Permata Hijau shophouse complex, South Jakarta. For those who don’t know what Babastudio is: Babastudio is a school of website and consulting. It provides website development services and training to various institutions, both government and private.


A Start Guide to Play with Data


A year ago, one of my leaders asked me to convince him why a data scientist must use a high-end laptop. It is not a difficult question to answer if you have been working with data for about 1-3 months on a low-spec laptop.

Well, the most painful work for a data scientist is data preparation, and it costs a lot of memory and storage.

What if your data is big? What if you want to create an Excel file with hundreds of thousands of rows? What if you need to install software that is restricted by the IT team?

You need a workstation or a laptop with at least 16 GB of RAM and 1 TB of storage.
And still … you will need a server!

Here are the details. I sent this slide deck to my leader. He then provided me with the laptop that I needed.

Just in case you need a similar justification, please download my slides here: A Start Guide to Play with Data.

Sharing Mobility Data Insights at Tech In Asia Conference 2018 Jakarta

You may see my previous post at this link: “Menimba ilmu di Tech in Asia (TIA) PDC 2017”. I didn’t actively contribute at that time. I was just in the audience, someone who only listened to what the speakers had to offer.

But the day came, one year later, on October 24, 2018. I was trusted to become one of the speakers. A big thanks to my leader, Mia Melinda, who trusted me to become a TIA speaker. A speaker is someone who gives value to others, someone who is responsible for not wasting 30 minutes of a super-busy audience’s time: an audience who had spent time and more than 1 million rupiah just to enter this 2-day event.

Back to Teaching, by Invitation

2018 has already entered March. Time flies. Jakarta is still hot, so most of its residents have to keep themselves comfortable with an air conditioner (AC). I am no exception. For the record, as I write this post, the bedroom AC is leaking and hasn’t been cleaned yet because the technician is busy. This has been going on for almost a month, so we have to catch the drips with a bucket. lol. And early this morning the story turned epic: the bucket wasn’t positioned right… so the floor got wet and I had to mop in the middle of my sound sleep.

Back to the topic about work.

Since the eighth month of last year, I have been building my career at a new place. That means I have been working there for seven months. It has not been easy, especially in terms of the bureaucracy, which feels rigid and fussy with all its trivialities, HR matters that fell short of expectations, and some work tools that were unsatisfactory at first.

But I will stay, because there are many positives: teammates who want to progress, want to improve, and support each other. It seems God answered my prayers, considering that I changed jobs three times in 2017. Here is the quote that keeps me strong.

I believe God always shows the way to those who believe. Just as Moses was shown the way by God to free the Israelites from slavery in Egypt, so too God shows me the best way.

Asked to teach again.

At the beginning of 2017 I started being asked to teach again after a break of roughly 9 months. The topics are not far from my work: data, R, SQL, machine learning, and the like. Thank you, IYKRA (Fajar and Zizah), for the invitation. It was a bit difficult at first, but once I got going, it was fine. I have felt a tremendous positive impact myself, because I am “forced” to learn, since that is mandatory preparation before teaching.

Over the past two months, I have taught 4 times. Quite a lot, right… And there will be more in April and May. Frankly, I am happy, considering the earnings can be used to replace and install the AC. 🙂 But more than that, I feel happy when other people benefit from what I share.

I believe I don’t have to wait until I am rich to share with others.

– BAK, 2018 –

First, I taught a workshop at Future Force Fair. It was attended by 180 participants. The format was a workshop on R. You can view the material for free, without any tax or a single cent of rupiah, here. (Just right-click, then “Save link as” or “Save target as”.)

Here are some photos of me teaching. Handsome, right?

Second, I taught ggplot2 and dplyr. The material can be downloaded here.

Third, I taught SQL for data analysis. The material can be downloaded here.

Fourth, I taught advanced SQL. The material can be downloaded here.

There will be more interesting stories to tell as I teach and give talks. So, stay tuned!

 

6 Top Big Data Use Cases done by Telkom Indonesia

Big Data Week Jakarta Stage

I attended the Big Data Week Jakarta event on March 23, 2017 as a participant, along with Hamid, Amir, and Ramdisa (my Stream Intelligence buddies). One of the speakers was Komang Aryasa from Telkom Indonesia. His presentation was very good, because he shared big data use cases delivered by the big data analytics team that he now leads. Something that we cannot get from a textbook!

Big Data in Telco

Well, what are they?

  1. Customer Problem Reporting (especially about the telco network). Imagine that all of the telco elements are human. If a human is sick, he shows symptoms. Likewise, telco elements raise alarms. In this use case, his team provides visualizations and failure prediction for those telco elements, so the maintenance team can respond to problems quickly before the telco elements are ‘dead’ (read: malfunctioning).
  2. Decreasing Churn Rate (by region). He said that his team has built 62 churn models. The models were split by region. Why? Surprisingly, North Jakarta customers are more price sensitive than South Jakarta customers, meaning that if the price is increased, they are more likely to switch to another operator or cancel the subscription. Conversely, South Jakarta customers are more problem sensitive than price sensitive, meaning that if there is an outage for several hours, they are less likely to stay subscribed. In short, this is how Telkom Indonesia understands its customers by location. Interesting!
  3. Effective Collection Caring. Have you ever been bugged by a customer service call while you were in a meeting? Very annoying, right? He said that his team was able to build predictive models of the best time for customer care to call a given customer. Have you thought about how to do that? Yes! One way is by looking at the customer’s browsing behavior. For example, if the browsing behavior at 8 pm changes to a news channel, it suggests that a parent (e.g. Dad) has just arrived home and is checking the news. That should be the best time for customer service to contact him, so he will be able to continue the conversation and the customer information collection is effective.
  4. Waste Management and Crime Reporting. He mentioned that this use case is a partnership with the Bandung Government. IoT devices (GPS trackers) are installed on the waste trucks, and the Bandung Smart City controller can track them in real time. Crime reporting, meanwhile, works like a panic button system that Bandung citizens can use to report problems such as heavy traffic, road damage, etc.
  5. Value Chain Transparency with Digital Tools & Empower Farmers with microloans, Banks, Bulog, BUMDs, DukCapil, etc.
  6. Tourism dashboards, built for the Indonesian government to track tourism spots visited by tourists in neighboring countries like Malaysia, Thailand, and Singapore, by utilizing real-time dashboards. The objective of this tracking is to attract 20 million foreign tourists to Indonesia in one year.

Here is a short brief on Komang Aryasa. He is Deputy Research of Big Data at PT Telkom Indonesia. His organization serves internal (Telkom) and external clients. His partner, Cloudera, is in charge of Telkom’s big data infrastructure and architecture, so before his speech, Cloudera’s Country Manager, Fred Groen, gave a short brief about his company.

 

 

The comparison between randomForest and ranger

Forests
Forest. Source: Here

A couple of days ago I had a chance to be a speaker at an internal data scientist meeting at the company that I work for: Stream Intelligence. The meeting is usually held monthly, and the last meeting, in October, was the 6th. We used Skype for Business to connect the data scientists in Jakarta and London.

I delivered a talk titled Random forest in R: A case study of a telecommunication company. For those who do not know random forest, Gopal Malakar has made a video, uploaded to YouTube, in which he explains what a random forest is. First of all, check the video out!

Based on the video, one important thing to remember about random forest is that it is a collection of trees. It is built from a number of decision trees. Each decision tree is grown from a random subset of the variables and observations of the training data.

Suppose that we have trained a random forest model made of 100 decision trees, and one test observation is fed into the model. If the decision trees output 60 Y and 40 N, then the output of the random forest model is Y, with a score or probability of 0.6.
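The voting step in that example can be sketched in a few lines. This is only an illustration, written in Python rather than R, and the function name forest_predict and the list of votes are made up for the example. Real packages may compute the score differently, for instance by averaging per-tree class probabilities instead of counting hard votes.

```python
from collections import Counter


def forest_predict(tree_votes):
    """Return the majority label and the fraction of trees that voted for it."""
    counts = Counter(tree_votes)
    label, n_votes = counts.most_common(1)[0]
    return label, n_votes / len(tree_votes)


# Hypothetical votes from 100 decision trees for one test observation.
votes = ["Y"] * 60 + ["N"] * 40

label, score = forest_predict(votes)
print(label, score)  # prints: Y 0.6
```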

OK, let’s practice how to train a random forest for classification in R. I just learned a couple of weeks ago from a DataCamp course that there are two random forest packages: 1) randomForest and 2) ranger. They recommend ranger, because it is a lot faster than the original randomForest.

To prove it, I created a script using the Sonar dataset and the caret package for machine learning, with methods ranger / rf and tuneLength = 2 (this argument sets how many candidate values are tried for the tuning parameter, here mtry, the number of variables randomly considered at each split). In random forest, mtry is the hyperparameter that we can tune.

Output of ranger training

Output of random forest training

So, random forest training with ranger is 26.75 - 22.37 = 4.38 seconds, or about 16%, faster than the original randomForest (assuming we use user time).
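The percentage depends on which run is taken as the baseline. A minimal sketch of the arithmetic, in Python for illustration only, takes the slower randomForest user time as the baseline; the helper name time_saved is made up for this example.

```python
def time_saved(baseline_seconds, candidate_seconds):
    """Return seconds saved and the saving as a percentage of the baseline."""
    saved = baseline_seconds - candidate_seconds
    return saved, 100 * saved / baseline_seconds


# User times reported above: randomForest 26.75 s, ranger 22.37 s.
saved, pct = time_saved(26.75, 22.37)
print(f"{saved:.2f} s saved, {pct:.1f}% less time")  # prints: 4.38 s saved, 16.4% less time
```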

However, when I changed the tuneLength parameter to 5, the original randomForest function turned out to be faster than ranger. Hmmm… it seems that I have to post a question to Stack Overflow or the DataCamp experts.