Using Artificial Intelligence for Diversity and Inclusion Insights

22nd Feb 2024 by Mark Holt

At Divrsity, our purpose has always been to help employers uncover data-driven insights that enable them to make measurable progress on workplace Diversity, Inclusion, Equity, Bias and Belonging.

Consequently, we've been using Artificial Intelligence / Machine Learning for several years. For example, the Results pages of our EDI Surveys automatically highlight demographics who are excluded, experience bias, and/or have a different workplace experience to their colleagues. We also use AI to exclude survey responses that attempt to "poison" the results.

However, the recent rise of Artificial Intelligence tools (such as ChatGPT) enables us to gain a much more nuanced and unbiased understanding, providing deep insight into the lived experience of different demographics.

Background

Our focus on data-driven insights means that we've historically only used multiple-choice questions in our surveys, and shied away from allowing our customers/partners to include free-text input. Free text is potentially powerful, but manually uncovering insights from hundreds of long text responses is a) time-consuming and b) inherently biased by the reader's own background and lived experience.

AI Tools and Bias: We know that many people in the D&I space are extremely concerned about AI being biased and potentially enhancing the disadvantage that some groups experience (e.g. organisations using AI to process CVs).

However, we are explicitly asking the AI to avoid having any kind of opinion about the data that is provided, meaning that we can avoid bias while consuming and summarising vast amounts of free text.

So what's new?

From today, all our surveys include an optional free text field where participants can write about their experiences of D&I in their workplace.

These "verbatim" responses are available for survey administrators to read and, once the survey is complete, we use our AI to summarise the contents by demographic. In other words, we combine the verbatim responses with the multiple-choice survey responses in order to provide another lens into workplace D&I.

For example, the AI might extract the fact that a wide variety of individuals don't really believe that senior leadership is committed to D&I. That's a powerful data point that is difficult to share but incredibly important.

Alternatively, it might read the responses and determine that middle-aged white men are consistently negative about D&I because "all this woke stuff is unnecessary". At the same time, many of the female demographic mention mansplaining and the non-white community are consistently raising concerns about micro-aggressions.

Important: AI Tools and Data Privacy

At Divrsity, we are obsessed with anonymity and privacy. We take our responsibilities as a GDPR Data Controller extremely seriously, and we do everything in our power to protect the anonymity of survey participants.

So it's important to start our discussion of AI tools by saying that we absolutely DO NOT use cloud-based AIs such as ChatGPT or Google Gemini! Notwithstanding that both are hosted in the US (making them non-starters from a GDPR perspective), we also cannot be certain that our data won't be used to train subsequent iterations of their AI, or that a slip-up (either by us or by them) won't accidentally expose highly sensitive data to their entire global customer base.

Instead, we use our own AI tool, which we run in the dedicated Divrsity hosted environment. This ensures that nobody outside Divrsity ever gets access to survey data and that our customers benefit from our obsession with anonymity. It also enables us to tweak the AI model to our own use case.

How it works

Please note that this feature is still experimental: we are continuously tuning the Divrsity AI model and, while we have done everything to ensure that responses are fair, honest and factual, the nature of Large Language Models means that we can't control every aspect of the response.

First of all, the survey needs to be configured to support the verbatim field. Because the feature is still experimental, you might need to switch this on manually.

Now, once survey participants have completed the body of the survey, they will see a page with a free-text field where they can write about their experience of Diversity & Inclusion at their workplace. This works for both regular surveys and Collective Surveys.

When the survey is complete, survey administrators can use the new verbatims page to view the responses. Unhelpful or rude responses can be hidden and excluded from the analysis.

At the top of the verbatims is the AI-generated synthesis of the data. We feed the AI with some of the key survey response data so that we can summarise opinions and attitudes by demographic. The screenshot to the right is an anonymised response from a real survey that we recently ran.
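To make the flow above more concrete, here is a minimal Python sketch of how visible verbatims could be grouped by demographic and turned into a single summarisation prompt. It is an illustration only: the Verbatim class, the field names, the demographic labels and the instruction text are all assumptions made for this example, not the actual Divrsity implementation or prompt.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Verbatim:
    demographic: str      # hypothetical label, e.g. derived from a multiple-choice answer
    text: str             # the free-text response
    hidden: bool = False  # set by an administrator for unhelpful or rude responses


# Hypothetical instruction text; the real prompt is not published.
INSTRUCTIONS = (
    "Summarise the survey comments below for each demographic group. "
    "Report only what respondents wrote. Do not add opinions, advice or judgements."
)


def group_visible(verbatims):
    """Drop hidden or empty responses and group the rest by demographic."""
    groups = defaultdict(list)
    for v in verbatims:
        if not v.hidden and v.text.strip():
            groups[v.demographic].append(v.text.strip())
    return dict(groups)


def build_prompt(verbatims):
    """Assemble one prompt string with a section per demographic."""
    sections = []
    for demographic, texts in sorted(group_visible(verbatims).items()):
        lines = "\n".join(f"- {t}" for t in texts)
        sections.append(f"## {demographic}\n{lines}")
    return INSTRUCTIONS + "\n\n" + "\n\n".join(sections)


# Made-up example data
responses = [
    Verbatim("Women", "I am often talked over in meetings."),
    Verbatim("Men 45-54", "I think we spend too much time on this."),
    Verbatim("Women", "spam spam spam", hidden=True),
]
prompt = build_prompt(responses)
print(prompt)
```

The hidden response never reaches the prompt, which mirrors the way administrators can exclude unhelpful or rude responses from the analysis. The resulting string would then be passed to whichever model is hosted internally.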

Behind the scenes...

Warning: Here comes the science bit

The Divrsity AI is based on a customised version of Mixtral: a super-cool, European Artificial Intelligence that competes with OpenAI's ChatGPT, as well as Meta's Llama3 and Google's Gemini.

We use Mistral's "Mixtral" engine, which does a great job with tasks such as common-sense reasoning by using a clever "mixture-of-experts" model. The AI contains eight different "experts", and for each piece of input a router automatically selects the most relevant experts (two of the eight, in the published Mixtral design) and combines their results to create its responses.
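For readers who like to see the idea in code, here is a toy Python sketch of mixture-of-experts routing. In the published Mixtral design, a router scores all eight experts for each token, keeps the two highest-scoring ones and blends their outputs using softmax weights. The experts and router scores below are made up for illustration; in the real model both are learned neural network layers, and the routing happens for every token in every layer.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_scores, k=2):
    """Pick the k highest-scoring experts and normalise their weights."""
    ranked = sorted(
        range(len(router_scores)),
        key=lambda i: router_scores[i],
        reverse=True,
    )
    chosen = ranked[:k]
    weights = softmax([router_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def mixture_output(x, experts, router_scores, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_scores, k))


# Eight toy "experts": expert i simply multiplies its input by (i + 1).
experts = [lambda x, m=i + 1: x * m for i in range(8)]

# Made-up router scores for a single token; experts 2 and 5 score highest.
scores = [0.1, -1.0, 2.0, 0.0, 0.3, 2.0, -0.5, 0.2]

routes = top_k_route(scores)
print(routes)
print(mixture_output(1.0, experts, scores))
```

Only two of the eight experts do any work for a given token, which is why a mixture-of-experts model can be cheaper to run than a dense model with a similar total parameter count.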

After much trial and error, we settled on this model as offering the best balance between cost and performance. Because we're mostly asking the model to summarise the data and generate helpful insights (and we explicitly don't want it to be creative), we're able to get great results from the 8x7B Mixtral model (8x7B meaning eight experts, each nominally around 7 billion parameters), rather than requiring the huge cost of a model like the full-fat Llama with 70 billion parameters.

N.B. It's fair to say that Llama3 is an absolutely HUGE improvement over Llama2 and is incredibly close on performance to Mixtral. It's also worth noting that Microsoft's recently released Phi3 model (which is trained on a high-quality dataset and designed to run on small devices) is a very credible alternative. For our use case, the league table (from best to worst) is as follows:

  1. Mistral's Mixtral 8x22b
  2. Meta's Llama3 70b
  3. Mistral's Mistral 8x22b
  4. Cohere's Command-R+ 35b
  5. Microsoft's Phi-3 3.8b
  6. Meta's Llama2 13b
  7. Bringing up the rear is Google's Gemma 7b which is truly awful!

Conclusion

It's fair to say that we've only scratched the surface of how AI can help generate D&I insights, and we will be constantly evolving the platform as the AIs evolve. The next version of Mistral already looks like it might be even better, and we expect the pace of AI improvements to increase significantly over the next 18 months.

Watch this space...


