A few months ago, I obtained my motorcycle license. The next logical step was to choose my first motorcycle. As a newcomer to this world, I thought I could use open data to make an informed decision. I had no idea that this process would lead me to insights about the challenges of accessing and using open data, particularly in journalism.
I started by researching motorcycle types and brands on various websites and forums. During this process, I discovered an intriguing trend in the EU motorcycle market. New brands were emerging, offering seemingly identical products at half the price of established brands. This seemed like an excellent opportunity for buyers. However, a cloud of uncertainty hung over these newcomers’ reliability.
While browsing a motorcycling website, I came across an article with a table detailing models, brands, and sales figures for 2023. The source was listed as ELSTAT, the Hellenic Statistical Authority, which is Greece’s national statistical service responsible for producing official statistics.
This discovery sparked an idea.
If ELSTAT provided open data on motorcycle sales, I could use this information to assess the reliability of these new brands. My plan was to compare sales and resale data between new and established brands. The logic was simple: if new brands had a low resale ratio, it could indicate high customer satisfaction and good quality. Fewer people selling their recently purchased motorcycles might suggest they were happy with their choice. Conversely, a high resale ratio might point to dissatisfaction or reliability issues.
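The comparison I had in mind can be sketched in a few lines of Python. This is only a rough illustration: I am assuming that "resale ratio" means used transfers divided by new registrations for the same brand and period, the `BrandCounts` fields are names I made up, and the numbers are invented placeholders rather than ELSTAT figures.

```python
from dataclasses import dataclass


@dataclass
class BrandCounts:
    """Counts for one brand over the same period (hypothetical fields)."""
    new_registrations: int
    used_transfers: int


def resale_ratio(counts: BrandCounts) -> float:
    """Used transfers divided by new registrations for the same period."""
    if counts.new_registrations <= 0:
        raise ValueError("new_registrations must be positive")
    return counts.used_transfers / counts.new_registrations


# Placeholder numbers, invented purely for illustration; not ELSTAT data.
sample = {
    "established_brand": BrandCounts(new_registrations=1000, used_transfers=150),
    "new_brand": BrandCounts(new_registrations=400, used_transfers=100),
}

# Compute the ratio per brand and list brands from lowest to highest ratio.
ratios = {brand: resale_ratio(counts) for brand, counts in sample.items()}
for brand, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{brand}: {ratio:.2f}")
```

Even with real data, a ratio like this would be a crude signal at best: transfers in a given year include motorcycles first registered in earlier years, and people sell bikes for many reasons other than dissatisfaction.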
Excited by this prospect, I began searching the ELSTAT website. I was looking for specific datasets:
- New motorcycle registrations by brand and model
- Used motorcycle sales or transfers by brand and model
- Time series data to observe trends over the past 5 years
However, I quickly ran into obstacles. The website’s menu structure was confusing, making it hard to find relevant data. The search function was equally frustrating. Broad terms like “motorcycle sales” gave too many results. Specific terms like “Honda CB650R registrations” returned nothing at all.
Determined to find the information, I turned to Google, hoping to locate ELSTAT reports for the years I was interested in. I found some, but they raised new problems. The documents didn’t have the model-specific data I needed, and even brand-level information was unclear: the reports focused on motorcycle-producing factories rather than individual brands or models. After several hours of fruitless searching, I had to give up.
My quest to use open data for motorcycle selection had failed.
This experience left me with several questions about how journalists use and present data: Where did the website get the data for their table? Was I looking for the wrong information due to my lack of expertise in motorcycles and vehicle registration? Did they combine ELSTAT data with other sources, like dealerships or industry reports?
Why didn’t they link to their data sources? Do they not see value in sharing sources? Are they unaware that readers might want to verify or analyze the data themselves? Or do they intentionally withhold source information to maintain a competitive edge in online journalism?
Open data didn’t help me choose a motorcycle. However, this experience validated many challenges our consortium has identified regarding open data accessibility and use. It highlighted issues such as poor data organization, lack of detail in official statistics, and the difficulties non-experts face when navigating complex datasets.
Moreover, it raised new questions about the relationship between journalism and open data. How can we bridge the gap between raw government data and useful public information? How can journalists better serve readers by making data sources more transparent and accessible?
These questions merit further exploration as we continue to advocate for more open and usable public data. My search underscores the need for improved data literacy, better data presentation methods, and more transparent reporting practices.
In the end, this experience is a microcosm of the open data landscape: open data has real potential to inform everyday decisions, but significant hurdles must be overcome before that potential is realized. Addressing them will be crucial to creating a more transparent, accessible, and useful open data ecosystem for all.
Image from Freepik