The Study

Exploring the Recruitment Pipeline Fallacy: Resume Skill Gaps and Boolean Bury Underrepresented Females/Minorities

A groundbreaking study from Talenya has found that the difficulty of finding diverse talent stems not from a lack of quality talent, but from the failure of the traditional Boolean-based search tools that recruiters use to find it.

April 2021


Diverse talent write resumes in a way that makes it harder for traditional talent sourcing tools to find them.

Diversity and Inclusion (D&I) are front and center for businesses around the world. This is not only because of the political environment, but also because they make a good business case for companies.

A 5-year study conducted by McKinsey & Company has shown “a positive, statistically significant correlation between company financial outperformance and diversity, on the dimensions of both gender and ethnicity”.

Companies are realizing the benefits of a more diversified workforce and are taking action to build one. McKinsey reports that “Diversity Leaders are taking bold and courageous steps to build fairer and more inclusive workplace cultures at all levels of the organization” but also that “the progress is slow”.

When speaking with many talent acquisition leaders, we found growing frustrations about the difficulties of finding and hiring more diverse talent.

At first, there was no apparent reason why recruiters struggle to find diverse talent. In our research, we found no scarcity of quality talent among women and minorities.

The underlying reasons became apparent only when we looked deeper into the data and into the methods used to source diverse talent, as part of our effort to identify why success in hiring diverse talent has been limited.

Talenya’s researchers reviewed and analyzed over 20 million profiles, from our database as well as from a variety of public sources, to find out whether profiles of diverse talent differ from those of non-diverse talent in ways that could make them harder to find.

Our study uncovered two striking findings:

  • Diverse talent write resumes differently, and
  • The methods used by recruiters to find talent give an advantage to non-diverse talent resumes.

This report outlines our findings and describes the unique ways diverse talent describe themselves. It also identifies the differences between the unique diversity groups and the reasons traditional talent sourcing tools are failing to uncover a significant proportion of diverse talent.


Doron Segal
Chief Technology Officer


David Marcus
Chief Data Scientist


Limited data on talent is at the core of the problem.

With primarily one main source for data, recruiters are bound to miss talent that have partial or no LinkedIn profiles.

Keyword search is limited and discriminatory.

When using keywords to find talent, you find people who have the right keywords in their profiles, but not necessarily the best talent.

Females and minorities describe themselves in a unique way

Each diversity category writes resumes in a unique way, including skills they post, photos, and the amount of text they enter.

Women tend to write less text on their profiles.

Overall, women’s profiles had 34.2% fewer skills than those of men.

Limited data on talent is at the core of the problem.

With primarily one main source for data, recruiters are bound to miss talent that have partial or no LinkedIn profiles.

When it comes to finding “passive” job seekers, LinkedIn is almost the exclusive source used by recruiters. With hundreds of millions of profiles (resumes), this social network has revolutionized the recruitment landscape and democratized access to talent data. It has also made it incredibly easy to engage with talent, using integrated messaging tools.

However, in today’s digital world, talent leave digital footprints in many places. For example, engineers may have more information on their skills on sites such as GitHub and StackOverflow than on LinkedIn.

To find full information on talent in general and on diverse talent in particular, recruiters have to perform searches on dozens of sources including Google.

Because talent data is often outdated, limited and dispersed, any talent search is bound to be limited and partial.

Keyword search is limited and discriminatory.

When using keywords to find talent, you find people who have the right keywords in their profiles, but not necessarily the best talent.

For many years, keyword search (typically expressed using Boolean logic) has been the most popular way to find talent. The recruiter enters a combination of keywords, usually connected via Boolean operators such as “AND” and “OR”, and then looks at the results. Then, based on the quality of the search results, the recruiter adds or deletes keywords in a “trial and error” sequence.

Although this method is simple, it is also highly inefficient and prone to bias.

Firstly, talent tend to describe themselves in a variety of ways. For example, one candidate may call themselves a “Project Manager” and another, with an identical set of skills, a “Program Manager”. Unless both job titles are entered in the search, some candidates will be missing from the search results.

Secondly, Boolean search is all about “YES”, it’s there, or “NO”, it isn’t. For example, if a skill is entered as a “must have”, anyone whose profile does not list that skill is excluded from the search results, even if they have the skill but simply failed to include it in their profile. Boolean search lacks gradations of skill importance such as “Preferred”, “Advantage”, etc.
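Both limitations can be reproduced with a few lines of code. The sketch below is only an illustration with invented profiles and a hypothetical `boolean_search` helper; it is not the query engine of LinkedIn or of any other sourcing tool. It applies a strict "(title OR title) AND skill AND skill" filter to three candidates with the same real-world skill set.

```python
# Toy profiles; names, titles, and skills are invented for illustration.
PROFILES = [
    {"name": "A", "title": "Project Manager", "skills": {"agile", "budgeting", "jira"}},
    {"name": "B", "title": "Program Manager", "skills": {"agile", "budgeting", "jira"}},
    # C uses Jira at work but did not list it on the profile.
    {"name": "C", "title": "Project Manager", "skills": {"agile", "budgeting"}},
]


def boolean_search(profiles, titles_any, skills_all):
    """Return names matching (title1 OR title2 ...) AND skill1 AND skill2 ..."""
    wanted_titles = {t.lower() for t in titles_any}
    wanted_skills = {s.lower() for s in skills_all}
    return [
        p["name"]
        for p in profiles
        if p["title"].lower() in wanted_titles and wanted_skills <= p["skills"]
    ]


# One job title only: B is missed because of the title wording.
narrow = boolean_search(PROFILES, ["Project Manager"], ["agile", "jira"])

# Both titles: B is found, but C is still excluded by the "must have" skill.
wider = boolean_search(PROFILES, ["Project Manager", "Program Manager"], ["agile", "jira"])

print(narrow)
print(wider)
```

Candidate C never appears in either result, because a yes/no filter cannot tell "does not have the skill" apart from "did not write it down".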

LinkedIn, as a social network, adds an additional barrier to the search. It only shows profiles of people who are connected to the searcher by up to 3 degrees of separation. The number of people you will find in your LinkedIn search is a function of the number and the types of people that are connected to you on LinkedIn, and the people who are connected to other such people.

As a result, if your social network on LinkedIn does not include, for example, many Black/African Americans, you are likely to have limited access to that talent pool.

In short, recruiters’ access to the diverse talent pool is hindered by their Boolean search expertise, by the amount of information diverse talent have on their profiles, and by their LinkedIn social network reach.

However, these limitations are further amplified by the unique ways that diverse talent describe themselves.

Females and minorities list fewer skills in their profiles than the baseline average

Each diversity category writes resumes in a unique way, including skills they post, photos, and the amount of text they enter.

Talenya’s researchers have analyzed over 20 million profiles across job titles, industries, and US locations (more about our research methods below).

Talenya’s innovative technology enables it to identify 4 diversity categories with 98% accuracy:

  • Female
  • Black/African American
  • Asian
  • Hispanic

The first step was to categorize each profile by its diversity category (some profiles may fall into two categories, such as Black Female). We ensured that the share of each diversity category in the sample was similar to its share of the general population.

We did not focus on any particular industry. However, all the profiles we analyzed were of skilled workers, so the study may not be relevant to unskilled workers or to talent without an online, digital footprint.

Talenya’s researchers found quantifiable skill deficits in public profiles of diverse talent vis-à-vis their white/male counterparts.

Talenya used its proprietary predictive algorithms to identify missing skills for these individuals. For example, it looked at hundreds of thousands of profiles with similar job titles and employment history and found commonalities in skills among such candidates. It then used predictive algorithms to predict and add missing skills to profiles. For example, if 90% of Data Scientists had Python listed as a skill, Python would be added to profiles of the other 10% (and clearly identified as added missing skills).
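The Python example in the paragraph above can be sketched as a simple frequency rule. Talenya's actual predictive algorithms are proprietary and also draw on employment history, so the code below is only a stand-in under stated assumptions: the function name `infer_missing_skills`, the profile fields, and the grouping by exact job title are all hypothetical, and the 0.9 threshold is taken from the 90% figure in the example.

```python
from collections import Counter, defaultdict


def infer_missing_skills(profiles, threshold=0.9):
    """Infer skills listed by at least `threshold` of same-title peers.

    Inferred skills are stored in a separate field so that they stay
    clearly identified as added, not self-reported.
    """
    by_title = defaultdict(list)
    for p in profiles:
        by_title[p["title"]].append(p)

    for peers in by_title.values():
        # Count how many peers list each skill (skills are sets, so no double counting).
        counts = Counter(skill for p in peers for skill in p["skills"])
        common = {s for s, n in counts.items() if n / len(peers) >= threshold}
        for p in peers:
            p["inferred_skills"] = common - p["skills"]
    return profiles


# Nine Data Scientists list Python; the tenth does not.
profiles = [{"title": "Data Scientist", "skills": {"python", "sql"}} for _ in range(9)]
profiles.append({"title": "Data Scientist", "skills": {"sql"}})

infer_missing_skills(profiles)
print(profiles[-1]["inferred_skills"])
```

Keeping inferred skills apart from listed ones lets a search include them while still showing the recruiter which skills the candidate actually wrote.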

We conclude that all candidates, but particularly diverse candidates, are selling themselves short on professional networking sites. It is important to note that without such inferred skills, some candidates, diverse or non-diverse, would be excluded from recruiters’ searches.

The chart presents the skill gaps across diversity categories. The grey bars relate to the skills found on all public profiles combined, while the blue bars relate to the total number of skills after adding inferred skills. Adding the inferred skills helps to further close the gap for diverse talent by 10% to 17%. The most significant increase is for Black/African American candidates.

Talenya performed an additional analysis to find out whether profiles belonging to diverse talent from different specialties like Finance, Customer Success, R&D, Sales and Marketing have a different number of skills. We found no significant differences across specialties and diversity categories.

Women tend to write less text on their profiles.

Profiles with less text are likely to appear lower on job search results.

Our study also revealed, surprisingly, that men and women tend to differ in the amount of text they put in their profiles. For the purpose of this study, we divided men and women into 3 groups, depending on the number of words they had in their profiles: short, medium and long. 

In the group with the shortest profiles, women outnumbered men by 12.3%. In the group with medium length profiles, that difference was 18.8% in favor of men. In the group with the longest profiles, men dominated again, with 71.4% more representation.
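The grouping can be sketched as follows. The study does not publish the word-count cut-offs or the exact formula behind "outnumbered by", so both are assumptions here: `SHORT_MAX` and `LONG_MIN` are hypothetical thresholds, and `relative_difference` uses one possible reading, the relative difference in head counts within a group.

```python
from collections import Counter

# Hypothetical cut-offs in words; the study does not state the ones it used.
SHORT_MAX = 100
LONG_MIN = 300


def length_group(word_count):
    """Assign a profile to the short, medium, or long group by word count."""
    if word_count <= SHORT_MAX:
        return "short"
    if word_count >= LONG_MIN:
        return "long"
    return "medium"


def relative_difference(profiles, group):
    """Percent by which men outnumber women in a group (negative if women lead).

    Returns None when the group contains no women, since the ratio is undefined.
    """
    counts = Counter(
        p["gender"] for p in profiles if length_group(p["word_count"]) == group
    )
    if counts["female"] == 0:
        return None
    return 100.0 * (counts["male"] - counts["female"]) / counts["female"]


# Invented sample data for illustration only.
sample = [
    {"gender": "female", "word_count": 80},
    {"gender": "female", "word_count": 90},
    {"gender": "male", "word_count": 60},
    {"gender": "female", "word_count": 350},
    {"gender": "male", "word_count": 400},
    {"gender": "male", "word_count": 500},
]

short_diff = relative_difference(sample, "short")
long_diff = relative_difference(sample, "long")
print(short_diff, long_diff)
```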

Overall, women’s profiles had 34.2% fewer skills than those of men. The amount of text entered on profiles helps explain why women have a lower number of skills listed on their profiles. We found that, with more text on their profiles, men tend to mention more skills within that text.

Recruiters are likely to consider sparse text on a profile as a reason to overlook or simply ignore candidates. With so many profiles to review, recruiters prefer to look at profiles that are rich and full of content to help them make an educated decision on whether to contact such candidates and invite them to an interview.

Summary of findings and possible solutions

Diverse talent profiles have unique characteristics that put them at a disadvantage compared to other profiles, but solutions are available.

Finding quality talent is a difficult task because data is limited, outdated and dispersed.

Keyword search methods are antiquated and inefficient, but when it comes to finding diverse talent, such challenges are intensified by the unique way diverse talent describe themselves through their public profiles.

Diverse talent tend to enter fewer skills and less text on their profiles, which makes it harder for recruiters to find them.

Talenya has addressed these challenges by developing new ways to search for talent. These new methods are based not on keywords, but on AI and machine learning technologies. Under such methods, recruiters’ real-time selections are used to determine priorities and preferences, and to refine the search accordingly.

AI finds every possible job title and skill permutation that talent may use on their profiles to identify every possible talent for the job, regardless of how they describe themselves.

It predicts and adds missing skills to profiles and, as a result, levels the playing field for diverse talent, who tend to post fewer skills and less text on their profiles.

Finally, Talenya uses AI technologies to recommend small changes to the search that, if accepted, are likely to maximize diverse talent representation in the recruitment pipeline.

Talenya’s AI sourcing platform taps into a talent pool of close to 1 billion profiles, increasing the chances for all talent to be considered, and especially the chances of underrepresented minorities and women to be found by recruiters.

Bringing underrepresented talent to the forefront of job searches is key to solving the issues of underrepresented talent populations.  

The so-called “diversity pipeline problem” is a fallacy. In fact, there has never been a more talented, diverse population of recruitable talent across both race and gender. The problem is that the tools and search processes used by recruiters are based on an antiquated technology (specifically, keyword or “Boolean” search) that is ill-equipped to find and surface these qualified, diverse candidates.

Research methodology

We used a random sample of 20 million profiles from the Talenya database. The dataset included 1,182,492 job titles and 143,939 skills; all profiles were US-based.

Profiles reflected a variety of seniorities, ranging from non-managers to senior managers.

Four diversity categories were analyzed: Female, Black/African American, Asian, and Hispanic.

We also analyzed males versus females, across ethnicity.

On our platform, gender and diversity are identified based on a multitude of data points, including photos, first name and last name, place of birth, native languages, schools, associations and affiliations, locations and more. Diversity category is validated through supervised learning.

Skills are derived using Talenya’s unique NLP algorithm, supported by a vast taxonomy developed by Talenya. Skills are identified whether presented on profiles separately or embedded in profile text.

Our proprietary Talent Search Engine is able to search millions of records per second to find matches using a sophisticated query language that allows the engine to apply relative importance when matching each of the data categories in profiles (from optional to must-have), and sophisticated search logic. The results of searches are then scored (on a scale of 0 to 100) and the Top-N matches are displayed to the recruiter.
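To make the contrast with yes/no keyword matching concrete, here is a minimal sketch of weighted matching with a 0 to 100 score and a Top-N cut. It is not Talenya's query language or scoring formula, which are not published; the `score` and `top_n` helpers, the numeric weights, and the sample profiles are all assumptions made for illustration. In this simplified version a "must-have" is only the heaviest weight, not a hard filter.

```python
import heapq


def score(profile_skills, weighted_skills):
    """Score 0-100: share of the total requirement weight the profile covers."""
    total = sum(weighted_skills.values())
    if total == 0:
        return 0.0
    matched = sum(w for s, w in weighted_skills.items() if s in profile_skills)
    return 100.0 * matched / total


def top_n(profiles, weighted_skills, n):
    """Return the n best (score, name) pairs, highest score first."""
    scored = [(score(p["skills"], weighted_skills), p["name"]) for p in profiles]
    return heapq.nlargest(n, scored)


# Hypothetical importance levels: 3 = must-have, 2 = preferred, 1 = optional.
weights = {"python": 3, "sql": 2, "spark": 1}

candidates = [
    {"name": "A", "skills": {"python", "sql", "spark"}},
    {"name": "B", "skills": {"python", "sql"}},
    {"name": "C", "skills": {"sql"}},
]

best = top_n(candidates, weights, 2)
print([name for _, name in best])
```

Unlike a strict Boolean filter, a candidate who omits one lower-priority skill is ranked lower but still surfaces in the results.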

The Talent Search Engine also incorporates an advanced statistics engine to help identify the statistics described in this paper.

This allows quick discovery of the number of profiles in any search category (diversity, skills, job titles, etc.) and made the findings in this paper much easier to identify.

About Talenya

Founded in 2017 by Gal Almog and Doron Segal, Talenya has developed the world’s most advanced, diverse talent sourcing solution, enabling talent acquisition teams to uncover and engage with 3X more diverse talent than any other tool.

Talenya’s Diversity AI™ collects fresh data from hundreds of sources to build rich, updated talent profiles. It uses AI and Machine Learning technologies to eliminate old school keyword search and to intelligently prioritize talent by their quality and by their propensity to change jobs. Talenya’s Diversity AI™ helps recruiters eliminate bias and increase the participation of diverse talent in the recruitment process. Talenya’s solution is used by the world’s leading employers.

Increase the representation of quality diverse candidates in your talent pools with the #1 AI-powered diverse talent sourcing solution.