GDELT, the Global Database of Events, Language and Tone, ingests news articles in every major language worldwide every 15 minutes, geocodes the mentions, and scores each coded event on the Goldstein conflict-cooperation scale (-10 to +10). It's the closest thing to a real-time global situation-awareness firehose that's both public and free.
The catch: GDELT produces so much data that a naïve query returns tens of thousands of events per day. Without the right filter pattern you'd give up before the second week.
This post is about how to wire GDELT for production conflict monitoring without drowning your ops channel.
The two GDELT datasets
Two different feeds, different shapes:
Events (GDELT 2.0)
A zipped file dropped every 15 minutes, listing every CAMEO-coded event with actor-1, actor-2, action code, lat/lon, Goldstein score, mention count and average tone. Despite the .CSV extension, the rows are tab-delimited. Volume adds up to 100,000+ rows per day.
URL pattern:
http://data.gdeltproject.org/gdeltv2/YYYYMMDDHHMMSS.export.CSV.zip
A companion GKG (Global Knowledge Graph) file is published on the same schedule and carries the themes, persons and organisations extracted from each article.
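As a rough illustration of the URL pattern, the sketch below rounds a timestamp down to the most recent 15-minute slot and fills in the template. It assumes the timestamps in the file names are UTC, and the function name gdeltExportUrl is made up for this post.

```javascript
// Build the Events export URL for the 15-minute slot containing `date`.
// Assumption: file-name timestamps are UTC and always end in 00 seconds.
function gdeltExportUrl(date) {
  const slot = new Date(date.getTime());
  // Round minutes down to 0, 15, 30 or 45 and zero out seconds and milliseconds.
  slot.setUTCMinutes(Math.floor(slot.getUTCMinutes() / 15) * 15, 0, 0);

  const pad = (n) => String(n).padStart(2, "0");
  const stamp =
    String(slot.getUTCFullYear()) +
    pad(slot.getUTCMonth() + 1) +
    pad(slot.getUTCDate()) +
    pad(slot.getUTCHours()) +
    pad(slot.getUTCMinutes()) +
    "00";

  return `http://data.gdeltproject.org/gdeltv2/${stamp}.export.CSV.zip`;
}
```

The newest slot may not be published yet at the moment you ask for it, so a poller would typically step back one interval when the request fails.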
DOC API
Real-time article search with filters by country, time, theme, tone:
https://api.gdeltproject.org/api/v2/doc/doc?query=...&mode=ArtList&format=json
The DOC API is the right starting point for most teams — easier to filter, ready-to-render output.
The three filters that turn it usable
1. Country + theme combination
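Don't query GDELT for "all conflict events." Query for "all articles from [country] tagged with [theme]." Note that sourcecountry: matches the country of the publishing outlet and takes GDELT's FIPS country codes (Lebanon is LE, not the ISO code LB). Terms separated by a space are ANDed. Example for Lebanon conflict monitoring:
query=sourcecountry:LE theme:CONFLICT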
mode=ArtList
format=json
maxrecords=50
The theme: filter uses the GKG taxonomy, which has ~250 themes covering protest, armed conflict, sanctions, terrorism, displaced persons and infrastructure damage. Pick 3-5 themes that match your risk model and skip the rest.
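To make the country + theme pattern reusable, a small builder along these lines can assemble the query string. This is a sketch, not an official client: buildDocQueryUrl and orGroup are names invented here, and it assumes the DOC API treats space-separated terms as an implicit AND and expects OR groups in parentheses.

```javascript
const DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc";

// Turn ["A", "B"] into "(prefix:A OR prefix:B)"; a single value needs no parentheses.
function orGroup(prefix, values) {
  const terms = values.map((v) => `${prefix}:${v}`);
  return terms.length > 1 ? `(${terms.join(" OR ")})` : (terms[0] ?? "");
}

// Build an ArtList request URL for a set of countries and themes.
function buildDocQueryUrl({ countries, themes, maxRecords = 50 }) {
  const query = [orGroup("sourcecountry", countries), orGroup("theme", themes)]
    .filter(Boolean) // drop empty groups
    .join(" "); // assumption: a space acts as AND

  const url = new URL(DOC_API);
  url.searchParams.set("query", query);
  url.searchParams.set("mode", "ArtList");
  url.searchParams.set("format", "json");
  url.searchParams.set("maxrecords", String(maxRecords));
  return url;
}
```

Calling buildDocQueryUrl({ countries: ["LE"], themes: ["PROTEST", "ARMEDCONFLICT"] }) yields a URL you can pass straight to fetch. The theme names are only examples; check them against the GKG theme list before relying on them.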
2. Goldstein severity
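Every event gets a Goldstein score: -10 (most conflict) to +10 (most cooperation). The score lives in the Events export (the GoldsteinScale column), not in DOC API responses, so apply this filter to event rows or to articles you have joined to events. For risk monitoring, filter to goldstein <= -5. That drops the "diplomat met diplomat" stories and keeps the "armed group attacked checkpoint" ones.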
3. Mention threshold
GDELT's mentions field (NumMentions in the Events export) counts how many times the event is mentioned across source articles. Single-mention events are roughly 90% noise (translation artefacts, syndication chains). Filter to mentions >= 3 and noise drops dramatically.
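Filters 2 and 3 combine into one predicate. The sketch below assumes event rows have already been parsed into objects with goldstein and mentions properties (hypothetical names for the GoldsteinScale and NumMentions columns); the sample rows are made up purely to show the behaviour.

```javascript
// Keep only events that are both severe and corroborated.
function passesSeverityFilters(event, { maxGoldstein = -5, minMentions = 3 } = {}) {
  return (
    Number.isFinite(event.goldstein) &&
    event.goldstein <= maxGoldstein &&
    event.mentions >= minMentions
  );
}

// Illustrative rows, not real GDELT data.
const sample = [
  { id: 1, goldstein: -2, mentions: 12 }, // too mild
  { id: 2, goldstein: -7, mentions: 4 }, // severe and corroborated
  { id: 3, goldstein: -9, mentions: 1 }, // severe but single-mention
  { id: 4, goldstein: 3.4, mentions: 40 }, // cooperative
];

const kept = sample.filter((e) => passesSeverityFilters(e));
```

Keeping the thresholds as parameters lets you loosen them per zone without touching the filter itself.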
Wire up sketch
A practical query for monitoring Iraq + Syria for armed-conflict + sanctions events:
// Space-separated terms are ANDed; OR groups go in parentheses.
const url = new URL("https://api.gdeltproject.org/api/v2/doc/doc");
url.searchParams.set(
  "query",
  "(sourcecountry:IZ OR sourcecountry:SY) " +
    "(theme:CONFLICT OR theme:ARMEDCONFLICT OR theme:SANCTIONS)",
);
url.searchParams.set("mode", "ArtList");
url.searchParams.set("format", "json");
url.searchParams.set("maxrecords", "100");
url.searchParams.set("sort", "datedesc");

const res = await fetch(url);
if (!res.ok) throw new Error(`GDELT DOC API returned HTTP ${res.status}`);
const articles = (await res.json()).articles ?? [];

// normaliseGdeltArticle and intersectsWatchZone are your own helpers:
// the first attaches geo, severity and mentions to each article,
// the second tests the location against your watch-zone polygons.
const matches = articles
  .map((a) => normaliseGdeltArticle(a))
  .filter((a) => a.severity >= 40 && a.mentions >= 3)
  .filter((a) => intersectsWatchZone(a.geo));
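The geofence step in the snippet above is left undefined. One plausible way to fill it in is a ray-casting point-in-polygon test, sketched here. It assumes zones are simple polygons given as arrays of [lon, lat] vertices and that geo is an object with lat and lon properties; it treats coordinates as planar and does not handle zones that cross the antimeridian.

```javascript
// Ray casting: count how many polygon edges a horizontal ray from the point crosses.
// polygon is an array of [lon, lat] vertices; the ring does not need to be closed.
function pointInPolygon([lon, lat], polygon) {
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const [xi, yi] = polygon[i];
    const [xj, yj] = polygon[j];
    const crosses =
      yi > lat !== yj > lat &&
      lon < ((xj - xi) * (lat - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Returns a checker with the same shape as the intersectsWatchZone(geo) call above.
function makeWatchZoneCheck(zones) {
  return (geo) =>
    geo != null && zones.some((zone) => pointInPolygon([geo.lon, geo.lat], zone));
}
```

With a list of zone polygons in hand, const intersectsWatchZone = makeWatchZoneCheck(zones) slots into the filter chain unchanged.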
Geocoding pitfalls
GDELT geocodes mentions, not events. A Reuters article about an incident in Syria, picked up by ten outlets, will show up ten times, each potentially geocoded to the publisher rather than the incident. For incident location, filter on the action geography (the ActionGeo fields in the Events export), not on the source's location.
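One cheap way to blunt the syndication problem is to collapse articles that share a headline before they reach the alerting step. The sketch below assumes each article object carries title, url and seendate properties, with seendate as a sortable timestamp string; collapseSyndicated is a name made up for this example, and matching on normalised titles is a heuristic that will miss rewritten headlines.

```javascript
// Lowercase and strip punctuation so trivially different headlines compare equal.
function normaliseTitle(title) {
  return (title ?? "")
    .toLowerCase()
    .replace(/[^\p{L}\p{N}]+/gu, " ")
    .trim();
}

// Keep the earliest copy of each headline and count how many copies were seen.
function collapseSyndicated(articles) {
  const byTitle = new Map();
  for (const article of articles) {
    const key = normaliseTitle(article.title) || article.url; // fall back to URL if no title
    const seen = byTitle.get(key);
    if (!seen) {
      byTitle.set(key, { ...article, copies: 1 });
      continue;
    }
    seen.copies += 1;
    if (article.seendate < seen.seendate) {
      Object.assign(seen, article); // earlier copy wins; the copies count is preserved
    }
  }
  return [...byTitle.values()];
}
```

The copies count doubles as a rough corroboration signal when you are working from the DOC API and have no mention count to lean on.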
For high-value operations (energy, supply chain, embassies), pair GDELT with ACLED — ACLED is human-curated, lower volume, vastly higher precision. Use GDELT for breadth, ACLED for accuracy.
Tone analysis — useful, with caveats
GDELT scores every article on a tone scale that nominally runs from -100 to +100, though most values fall between -10 and +10. Negative tone = bad news. Useful as a secondary signal:
- Sudden tone drop in a country = sentiment break, often precedes an event
- Persistent low tone = ongoing crisis, baseline elevated
But tone analysis is fooled by satire, sarcasm and translation. Don't use tone as a primary trigger. Use it as a corroborating signal alongside event count and Goldstein score.
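A "sudden tone drop" can be made concrete by comparing the latest reading to a trailing baseline. The sketch below uses a plain z-score and assumes you already aggregate average tone per country per day; the -2 threshold and the 7-day minimum history are arbitrary starting points, not GDELT recommendations.

```javascript
// How many standard deviations the latest tone sits from the trailing mean.
function toneZScore(history, latest) {
  const mean = history.reduce((sum, v) => sum + v, 0) / history.length;
  const variance =
    history.reduce((sum, v) => sum + (v - mean) ** 2, 0) / history.length;
  const std = Math.sqrt(variance);
  return std === 0 ? 0 : (latest - mean) / std;
}

// Flag a sentiment break only when there is enough history to trust the baseline.
function isToneBreak(history, latest, threshold = -2) {
  return history.length >= 7 && toneZScore(history, latest) <= threshold;
}
```

Because the baseline moves with the data, a country in persistent crisis stops triggering once low tone becomes its normal, which matches the "baseline elevated" case above. Treat a true result as corroboration for an event-based alert rather than as a trigger on its own.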
What this catches in practice
Three real signals that GDELT-based geopolitical risk monitoring catches reliably:
Sudden protest escalation: "protest" theme + Goldstein ≤ -3 + mentions >= 5 in last hour, geofenced to operating-country boundaries. Catches the moment a local protest goes from background to operational-impact.
Sanctions movements: "SANCTIONS" theme + actor-1 = government, geofenced to supplier-country boundaries. Catches new sanctions before the trade compliance team's daily digest.
Infrastructure damage: "infrastructure" theme + Goldstein ≤ -5, geofenced to pipeline / cable corridor polygons. Catches the public-reporting moment of an attack you may have already seen via AIS or seismic.
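The three signals above can be written down as data rather than code, which makes thresholds easy to review. This is a sketch under assumptions: the event shape (themes, goldstein, mentions, actor1Type, seenAt in epoch milliseconds) and the theme strings are placeholders for whatever your normalisation step produces, GOV stands for a government actor, and geofencing is assumed to happen in a separate step.

```javascript
const RULES = [
  { name: "protest-escalation", theme: "PROTEST", maxGoldstein: -3, minMentions: 5, windowMinutes: 60 },
  { name: "sanctions-movement", theme: "SANCTIONS", actor1Type: "GOV" },
  { name: "infrastructure-damage", theme: "INFRASTRUCTURE", maxGoldstein: -5 },
];

// Return the names of every rule the event satisfies; unset conditions are skipped.
function matchRules(event, now, rules = RULES) {
  return rules
    .filter((rule) => {
      if (!event.themes.includes(rule.theme)) return false;
      if (rule.maxGoldstein !== undefined && !(event.goldstein <= rule.maxGoldstein)) return false;
      if (rule.minMentions !== undefined && !(event.mentions >= rule.minMentions)) return false;
      if (rule.actor1Type !== undefined && event.actor1Type !== rule.actor1Type) return false;
      if (rule.windowMinutes !== undefined && now - event.seenAt > rule.windowMinutes * 60000) return false;
      return true;
    })
    .map((rule) => rule.name);
}
```

Run matchRules only on events that have already passed the watch-zone check, so each rule stays about content and the polygons stay about place.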
False-positive tax
GDELT noise levels:
- Without filters: 100,000+ events/day globally
- Country-only filter: ~500-2,000/day per country
- Country + theme + Goldstein ≤ -5: ~50-200/day per country
- Country + theme + Goldstein ≤ -5 + mentions >= 3: ~10-30/day per country
For a watch zone over a single country, 10-30 events per day is workable if you also apply a severity threshold per zone.
Free starter stack
To wire GDELT-based conflict monitoring this week:
- Pick 3-5 GKG themes relevant to your risk model
- Define country / region polygon zones (use Natural Earth boundaries)
- Poll DOC API every 15 minutes per zone (rate limits are generous)
- Score each article on a 0-100 scale using Goldstein + mentions + tone
- Fire alerts above severity threshold via Slack / webhook
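The scoring step in the list above is the least obvious one, so here is one possible shape for it. The weights (60% Goldstein, 25% mentions, 15% tone), the log scaling of mentions and the input property names are all assumptions chosen for illustration; tune them against your own alert history.

```javascript
const clamp01 = (x) => Math.min(1, Math.max(0, x));

// Combine Goldstein, mention count and tone into a 0-100 severity score.
function severityScore({ goldstein, mentions, tone }) {
  const conflict = clamp01(-goldstein / 10); // 0 at Goldstein >= 0, 1 at -10
  const reach = clamp01(Math.log10(Math.max(mentions, 1)) / 2); // 1 mention -> 0, 100+ -> 1
  const negativity = clamp01(-tone / 10); // 0 at tone >= 0, 1 at -10 or below
  return Math.round(100 * (0.6 * conflict + 0.25 * reach + 0.15 * negativity));
}

// Decide whether an event clears the alert threshold for its zone.
function shouldAlert(event, threshold = 40) {
  return severityScore(event) >= threshold;
}
```

Goldstein carries most of the weight because tone is the easiest of the three inputs to fool. The default threshold of 40 mirrors the cut-off used in the wire-up sketch earlier in the post.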
Augur's GDELT integration wraps all of this with the 0-100 severity scale and the same geofencing pipeline used for the other 20 feeds. The data is free; the merge is the work.
When you outgrow GDELT
GDELT works when you need breadth at zero cost. The day you need:
- Human-curated event records (ACLED, IISS, Janes)
- Source verification per event (most paid feeds)
- 24/7 analyst escalation (Crisis24, FactSet, Recorded Future)
- Sub-15-minute latency (commercial newsfeeds)
…you graduate to a paid feed. GDELT stays valuable as a corroborating signal even at that point.
The first 18 months of operational awareness fit inside GDELT + USGS + AIS + NHC + a few RSS feeds. That's the minimum viable OSINT stack.