*UPDATE*: After a crushing Illini loss and the end of the men’s college basketball season, UIUC BracketOdds’ top bracket in this set, #15,458,212, outscored all ESPN brackets, correctly predicting 57 of the 63 games and scoring 1,840 points!
This year, users submitted a record 26,567,887 men’s basketball brackets on ESPN. While they initially generated over 200,000,000 brackets, Fagen-Ulmschneider and Hyde limited their analysis to the first 26,567,887 generated to ensure a fair, equal-sized comparison.
As for women’s brackets, their top bracket was just two games short of a perfect bracket, easily taking the top spot amongst ESPN-submitted brackets!
201 million brackets in 3.5 days: Two Illinois CS researchers, a world-class algorithm, and NCSA computing power produced that massive set, which is currently outperforming all men’s and women’s brackets submitted on ESPN. Even more, they turned the results into a searchable, public tool built on cutting-edge data structures.
Illini Men’s Basketball might not be the only ones bringing home hardware to Champaign-Urbana for their performance in the NCAA March Madness tournament.
As of now, an Illini student and faculty member hold the top spot among March Madness bracket entries this year. Thanks to data science, the University of Illinois’ BracketOdds model, and a couple of late nights, the collection of brackets generated by Siebel School of Computing and Data Science Teaching Professor Wade Fagen-Ulmschneider and computer science student Lauren Hyde is still in the running to outperform every bracket officially submitted on ESPN for both the men’s and women’s tournaments.
In just three and a half days, from Selection Sunday at 5 p.m. to Wednesday night, the pair managed to produce 201 million men’s brackets, the best of which has 56 correct games out of 60 played so far, and a maximum of 1,870 points using the ESPN scoring method.

On the women’s side, they produced another 201 million brackets, the best of which has just one game wrong. You can compare your bracket to each one.
While they can be credited for the massive output and application, Computer Science Professor Sheldon H. Jacobson had spent almost two decades studying tournament odds before Fagen-Ulmschneider came to him at lunch, requesting his coveted algorithm.
“Wouldn’t it be cool if we made a perfect bracket?” Fagen-Ulmschneider asked Jacobson. A question that isn’t easy to say no to. “So, we had the algorithm, we had the data, we had the compute, so it was time to just go crazy with it.”
At first, it took 1.4 seconds to generate one bracket. Fine if you want to generate a few dozen, or even a few hundred thousand, but that wasn’t the goal here. After some optimization of the code and a transfer to NCSA’s Radiant cloud computing service, it was down to a few milliseconds per bracket.
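A quick back-of-the-envelope check shows why 1.4 seconds per bracket was a non-starter. The sketch below uses only figures quoted in this article (201 million brackets, 3.5 days) and assumes, purely for illustration, a single worker generating brackets one after another; the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

TARGET_BRACKETS = 201_000_000   # figure quoted in the article
INITIAL_SECONDS_EACH = 1.4      # unoptimized time per bracket
WINDOW_DAYS = 3.5               # Selection Sunday to Wednesday night


def days_needed(n_brackets: int, seconds_each: float) -> float:
    """Days of sequential compute needed at a fixed per-bracket cost."""
    return n_brackets * seconds_each / SECONDS_PER_DAY


def budget_ms_per_bracket(n_brackets: int, window_days: float) -> float:
    """Average per-bracket time budget (in ms) to finish within the window."""
    return window_days * SECONDS_PER_DAY / n_brackets * 1000


unoptimized_days = days_needed(TARGET_BRACKETS, INITIAL_SECONDS_EACH)
budget_ms = budget_ms_per_bracket(TARGET_BRACKETS, WINDOW_DAYS)

print(f"{unoptimized_days:,.0f} days (~{unoptimized_days / 365:.1f} years)")
print(f"{budget_ms:.2f} ms per bracket")
```

Under those assumptions, the unoptimized code would have needed roughly 3,257 days, close to nine years, while a single worker finishing inside the 3.5-day window has a budget of about 1.5 milliseconds per bracket.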
But for Fagen-Ulmschneider, this was more than just pumping out brackets; it was an opportunity to learn and let others benefit as well: a chance to “nerd out.”

“I was able to get that code and optimize it some, but we didn’t want to just make brackets and put them in a text file,” he added, noting that it was the searchability that was now the goal, with a website that everyone could have fun with.
That’s where Hyde, a member of Fagen-Ulmschneider’s research group, came in. Her talent with complex, ground-level projects turned out to be well suited to squeezing out every bit of efficiency in indexing. The pair’s goal was to represent the bracket as minimally as possible, “making it fast and making it memory efficient,” as Hyde put it, so that when a user supplies a partial bracket selection (say, they’ve picked winners for 10 games but left the rest open), the tool returns all matching brackets out of 201 million in milliseconds.
Initially, it took 10 minutes to do a linear search of every bracket for one query, and “there was no way” they could put out something that slow.
So, each bracket needed to be stored as compactly as possible. The solution was a 63-bit bitstring: one bit per game, 0 or 1 depending on which team won. For each possible game outcome, Hyde maintained a bitmap of length 201 million, where each bit represents whether a given bracket has that outcome. When a user inputs their selection, the search is then just a series of bitwise operations across the relevant bitmaps.
This is “kind of the obvious way to do it,” per the duo, but building everything on top of it effectively was the challenge.
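The idea can be sketched in a few lines. This is a minimal illustration rather than the pair’s actual code: it uses plain Python integers as bitmaps, a toy set of 5,000 random brackets instead of 201 million, and hypothetical function names. It also assumes one bitmap per game and outcome, and includes the slow linear scan as a baseline for comparison.

```python
import random

NUM_GAMES = 63  # a 64-team bracket has 63 games


def linear_search(brackets, picks):
    """Baseline: check every bracket against the picks (dict of game -> outcome)."""
    return [i for i, b in enumerate(brackets)
            if all(((b >> g) & 1) == o for g, o in picks.items())]


def build_index(brackets):
    """index[g][o] is an int whose bit i is set iff bracket i has outcome o in game g."""
    index = [[0, 0] for _ in range(NUM_GAMES)]
    for i, b in enumerate(brackets):
        for g in range(NUM_GAMES):
            index[g][(b >> g) & 1] |= 1 << i
    return index


def query(index, picks, n_brackets):
    """AND together the bitmaps for each pick; the surviving bits are the matches."""
    result = (1 << n_brackets) - 1  # start with every bracket as a candidate
    for g, o in picks.items():
        result &= index[g][o]
    return [i for i in range(n_brackets) if (result >> i) & 1]


# Toy data: each bracket is a random 63-bit integer, one bit per game.
rng = random.Random(0)
brackets = [rng.getrandbits(NUM_GAMES) for _ in range(5_000)]
index = build_index(brackets)

# Hypothetical partial selection: three games picked, the rest left open.
picks = {0: 1, 5: 0, 62: 1}
hits = query(index, picks, len(brackets))
print(len(hits), "matching brackets")
```

The index costs time and memory to build, but each query then touches only as many bitmaps as the user made picks, instead of every bracket.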
This was a hardware challenge, too. Their first virtual machine had only 4 GB of RAM, barely enough to hold the raw brackets and IDs, let alone run a search index on top of them. The server kept crashing, so they upgraded to 16 GB.
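A rough estimate shows how tight 4 GB would be. The numbers below are assumptions for illustration, not figures from the project: the full 201 million brackets, each 63-bit bracket stored in one 64-bit word, one 64-bit integer ID per bracket, and an uncompressed index with one bitmap per game and outcome.

```python
N_BRACKETS = 201_000_000
NUM_GAMES = 63
GIB = 2**30

raw_bytes = N_BRACKETS * 8      # assumption: one 64-bit word per bracket
id_bytes = N_BRACKETS * 8       # assumption: one 64-bit integer ID per bracket
bitmap_bytes = N_BRACKETS // 8  # one bit per bracket, uncompressed
index_bytes = 2 * NUM_GAMES * bitmap_bytes  # assumption: 126 bitmaps in total

data_gib = (raw_bytes + id_bytes) / GIB
index_gib = index_bytes / GIB

print(f"raw brackets + IDs: {data_gib:.2f} GiB")
print(f"uncompressed index: {index_gib:.2f} GiB")
```

Under these assumptions the raw brackets and IDs alone come to about 3 GiB, and an uncompressed index adds roughly another 3 GiB, which would not fit alongside them in 4 GB.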
They made progress: around 10 million brackets in under a second, but Fagen-Ulmschneider knew this wasn’t enough lottery tickets to have a chance at the crown.
“I was like, Lauren, ‘I may have accidentally run this algorithm overnight, and I may accidentally have 200 million brackets. What can you do with this?’” he said.
Hyde adapted, switching over to the Roaring Bitmaps library, which added crucial compression and had a lot of optimizations under the hood. Still, there were problems. The server would crash, and every time it did, it had to rebuild the entire bitmap index from scratch, which took about a minute. For a public-facing website, that’s unacceptable.
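For readers curious about where the compression comes from, Roaring bitmaps split the integer space into chunks of 65,536 values keyed by the high 16 bits, and store each chunk either as a short list of values when it is sparse or as a fixed-size bitmap when it is dense. The toy class below illustrates only that layout. It is not the Roaring library’s API, it omits the library’s run-length containers, and it uses a Python set as a stand-in for the sorted array container; the class and method names are hypothetical.

```python
ARRAY_LIMIT = 4096  # Roaring's cutoff between array and bitmap containers


class ToyRoaring:
    """Toy illustration of Roaring's chunked layout; not the real library."""

    def __init__(self, values=()):
        # high 16 bits -> set of low 16-bit values (sparse) or int bitmap (dense)
        self.chunks = {}
        for v in values:
            self.add(v)

    def add(self, value: int) -> None:
        high, low = value >> 16, value & 0xFFFF
        container = self.chunks.get(high, set())
        if isinstance(container, set):
            container.add(low)
            if len(container) > ARRAY_LIMIT:
                # Too dense for an array: convert the chunk to a bitmap.
                bits = 0
                for x in container:
                    bits |= 1 << x
                container = bits
        else:
            container |= 1 << low
        self.chunks[high] = container

    def __contains__(self, value: int) -> bool:
        high, low = value >> 16, value & 0xFFFF
        container = self.chunks.get(high)
        if container is None:
            return False
        if isinstance(container, set):
            return low in container
        return bool((container >> low) & 1)

    def container_kinds(self):
        """Report which representation each chunk currently uses."""
        return {h: "array" if isinstance(c, set) else "bitmap"
                for h, c in self.chunks.items()}


r = ToyRoaring(range(5000))  # 5,000 values in chunk 0: dense enough for a bitmap
r.add(70_000)                # a lone value in chunk 1: stays an array
kinds = r.container_kinds()
print(kinds)
```

Choosing the representation per chunk is what lets a single structure stay compact whether a given outcome is shared by almost every bracket or by very few.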
“You’re not going to wait for more than a few seconds on a scrolly scroll bar or a loading symbol, before you click away…especially for something that’s just for fun,” Fagen-Ulmschneider said.
Hyde solved this by writing pre-processing scripts that serialized the fully-built index to disk as a file. On restart, the server loads the pre-built file rather than recomputing it, cutting startup time dramatically.
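The build-once, load-on-restart pattern might look something like the sketch below. The article does not describe the actual file format or scripts, so everything here is an assumption: the index is modeled as a list of Python-integer bitmaps, written as fixed-width byte strings, and `save_index`, `load_index`, and `load_or_build` are hypothetical names.

```python
import os
import tempfile


def save_index(bitmaps: list[int], n_brackets: int, path: str) -> None:
    """Write a small header, then each bitmap as fixed-width little-endian bytes."""
    width = (n_brackets + 7) // 8
    with open(path, "wb") as f:
        f.write(n_brackets.to_bytes(8, "little"))
        f.write(len(bitmaps).to_bytes(8, "little"))
        for bm in bitmaps:
            f.write(bm.to_bytes(width, "little"))


def load_index(path: str) -> tuple[list[int], int]:
    """Read back exactly what save_index wrote."""
    with open(path, "rb") as f:
        n_brackets = int.from_bytes(f.read(8), "little")
        count = int.from_bytes(f.read(8), "little")
        width = (n_brackets + 7) // 8
        bitmaps = [int.from_bytes(f.read(width), "little") for _ in range(count)]
    return bitmaps, n_brackets


def load_or_build(path, build):
    """Load the pre-built index if it exists; otherwise build it once and save it."""
    if os.path.exists(path):
        return load_index(path)
    bitmaps, n_brackets = build()
    save_index(bitmaps, n_brackets, path)
    return bitmaps, n_brackets


def build_toy_index():
    # Stand-in for the expensive index build: three bitmaps over four brackets.
    return [0b1011, 0b0110, 0b1111], 4


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.bin")
    first = load_or_build(path, build_toy_index)   # builds, then saves to disk
    second = load_or_build(path, build_toy_index)  # loads the saved file
```

A restart after a crash then costs one file read rather than a full pass over every bracket.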
From 10 minutes to a few milliseconds, the two bent the problem to their will in less than four days and have a winning bracket to show for it, not to mention a lightning-fast application for anyone to use. For them, though, it was just another example of what data science research can offer.
“I’ve been in Wade’s research group since my freshman Fall, and we still have new projects. This is something that’s not like anything we’ve ever done before,” Hyde said. “You get a lot of really interesting opportunities that you wouldn’t necessarily see in your classes.”
Learn More and Get Involved
Explore this project in more detail at their website.
Contact the Office of Data Science Research to tell us about other people or resources we could feature here. ODSR is a campuswide convening organization that facilitates collaborations, resource sharing, and public engagement focused on data science research activities at the University of Illinois.