Executives Driving Efficiency & Governments Predicting Voting Results with AI . . . But Where Does All the Data Come From?
Jordan Kelly • 22 June 2025

AI Expert Delves Into the Ethics of Sourcing the Data that 'Feeds' LLMs

(By Columnist Jamie Munro, AI & Robotics Expert)


The last few years have seen Artificial Intelligence break out increasingly into mainstream usage. Its usefulness isn’t the question. The questions, at the user level, revolve around how to employ it for greatest effect (including competitively), and at the level of AI model training, how to advance AI models that ensure balanced, properly informed outputs from ethically-sourced inputs.


Executives are looking to drive efficiency improvements. Democratic governments are starting to use AI to predict voter outcomes while their counterparts in authoritarian regimes are using it to monitor citizen activities and crack down on dissent. Three in five doctors in the U.S. now report using Artificial Intelligence as part of their practice. Internal research at my firm (willowlearn.com) indicates that between 40 and 60 percent of teachers in the UK now use AI in some component of their role. The most eager early adopters of AI are university students - with the latest numbers from the UK's Higher Education Policy Institute showing 88 percent of university students admitting to the use of AI in their assignments.


You’re An AI User Whether You Know It or Not


You've almost certainly started to hear the names of various AI products coming up in conversations - names like ChatGPT, Gemini, Claude and Perplexity.


Even if you haven't deliberately used one of these products, you've definitely interacted with some sort of product or service employing AI in some way. If you've used Google over the last few months, you will have started noticing the "AI overview" section at the top of the results (congratulations – you are now an AI user).


So now that you are an AI user, you might be asking, what exactly is AI? To give a simple answer, the “AIs” behind these products are, in layman’s terms, computer programs (“Large Language Models”, or “LLMs”) that you can interact with using natural human language, much like texting with a (very knowledgeable) friend. When you send a message to an LLM, it responds with a message of its own. What it's really doing is using some very complicated mathematics to predict, based on your input, the words most likely to come next, and then generating them for you. It's a bit of a simplification, but you can think of it as a beefed-up version of the predictive text system on your phone.
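To make the predictive-text comparison concrete, here is a toy sketch in Python. It is not how ChatGPT or any other real LLM is built (those use neural networks trained on vastly more text). It simply counts which word tends to follow which. The tiny corpus and the function names `train_bigrams` and `predict_next` are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the word most often seen after `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Made-up training text, for illustration only.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # prints "cat"
```

In this made-up text, "cat" follows "the" three times while "mat" and "fish" follow it once each, so the sketch predicts "cat". A word it has never seen, such as "dog", gives it nothing to go on.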


So How Are These Computer Programs Able to Predict What You Want to See?


This is where data comes in, and lots of it. AI models need to be shown millions, or even trillions, of examples of human language in a process known as "training". This leads to another question - where does the data come from?


The answer to that question is sensitive.


Firstly, companies developing Artificial Intelligence LLMs for the marketplace need vast amounts of data to remain competitive. Many have been accused of employing unscrupulous methods to obtain it.


The issue of copyright around AI training materials is still an open question and probably one that can only be addressed with new legislation. It emerged earlier this year that Meta (Facebook’s parent company) used a large number of pirated books as part of its training data. That's just one of many copyright cases against AI giants currently making their way through the U.S. court system.


Copyright Dilemmas:  Is It Fair to Content Creators & Other Humans?


Proponents of greater copyright protections would argue that AI fundamentally breaks the current monetisation systems for content creators.


Under current systems, content creators list their content on search engines like Google and YouTube. Users search for topics they are interested in and the search engine shows them adverts (which is how it makes money). The user then clicks on an item that interests them, and the creator of that content has the chance to monetise their readers/viewers.


As users move away from search engines and towards Artificial Intelligence, AI services simply provide an authoritative answer to the user’s query without linking to an external resource. Previously, a user searching for “how to change a tyre” might find their answer on a mechanic’s website and decide to purchase some service from that mechanic’s business. But in the future, they will increasingly ask an AI how to change the tyre, and the AI will just tell them - even if the AI originally learned how to do that from the mechanic's website.


Is that fair to the original content creator (the mechanic, in this case)?


Proponents of reduced copyright protections would argue that the above argument would never be applied to humans. If I spent a year reading books about a particular topic and became an expert, and then I wrote my own book about the topic, nobody would argue that I'm just regurgitating the books I read (unless I was found to have plagiarised large chunks without attribution). The other argument is economic: if we start enforcing copyright protections for the creators of AI training material, the cost of these already expensive AI models will get even higher – and countries that don't care about copyright protections (such as China) will gain a massive advantage.


Concentration of Control & Potential Misuse of Power


The second major data-related problem is one of control and power.


Everything an AI "knows" comes from its training data, so whoever decides what is included in the training data has immense power.


In a world where everyone gets their information from AI, the creators of AI will have massive influence over public opinion, much like the broadcasters and newspaper owners in the pre-social media world. Currently this power is concentrated in the hands of a few tech giants like Google, Meta and OpenAI.


Efforts to regulate these giants and AI more broadly could result in concentrating the power into even fewer hands.


AI Bias in Elections? And Why Do They All Sound the Same?


There have already been numerous accusations of bias made against major AI models, for example during the 2024 U.S. elections.


Many users note that the major AI models, even ones from different developers, end up producing rather similar "facts". AI models are only as good as the training data and any biases in this data will certainly be reflected by the model. The similarities between model outputs can be explained if you consider that all the AI companies are broadly using the same training data – the contents of the internet.


The training data supplied to an AI model acts as a type of voting mechanism. The more times a particular idea or concept appears in the training data, the more likely the model is to reproduce it. This means that AI models are much more likely to adopt mainstream viewpoints, and less popular, more controversial or dissenting opinions and views are much less likely to be represented.
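The "voting" idea can be sketched in a few lines of Python. This is a deliberate oversimplification: real models do not literally tally statements, and the 90/10 split, the labels and the function name `output_probabilities` are all hypothetical. The sketch only shows how the share of appearances in the training data can carry over into the share of outputs.

```python
from collections import Counter

def output_probabilities(training_examples):
    """Turn raw counts of each statement into the share of 'votes' it holds."""
    counts = Counter(training_examples)
    total = sum(counts.values())
    return {statement: n / total for statement, n in counts.items()}

# Hypothetical training set: one view appears nine times as often as the other.
training_examples = ["mainstream view"] * 90 + ["dissenting view"] * 10
probs = output_probabilities(training_examples)
print(probs)  # {'mainstream view': 0.9, 'dissenting view': 0.1}
```

Under these assumptions, the dissenting view would surface only about one time in ten, even though nothing in the code was told to prefer one view over the other.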


Are we heading towards a future where ordinary people only have access to the “official line”? And do we want the likes of Google, Meta and OpenAI deciding what that official line should be?


It’s too late to go back to a world without AI – that ship has already sailed. Countries, companies and individuals who fail to adopt AI will be left behind and out-competed by those who are already on board.

 

Humanity must bravely face the future and embrace the massive opportunities presented by Artificial Intelligence. But as we move towards that future, we cannot shy away from the issues introduced by AI. They must be tackled thoughtfully and they must be tackled head-on.


Artificial Intelligence is ultimately a technology built by humans, for humans, to make life better for us all. But whether the reality turns out to be “for better” or “for worse” will depend wholly on whether we address these fundamental and critical issues.


