{"id":27681,"date":"2024-05-06T17:43:36","date_gmt":"2024-05-06T17:43:36","guid":{"rendered":"https:\/\/statplace.com.br\/?p=27681"},"modified":"2024-10-04T16:52:48","modified_gmt":"2024-10-04T16:52:48","slug":"automatizando-tarefas-de-rh-com-machine-learning","status":"publish","type":"post","link":"https:\/\/site.statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/","title":{"rendered":"Automatizando tarefas de RH com Machine Learning"},"content":{"rendered":"\n

Article by Gabriel Lafet\u00e1<\/em><\/p>\n\n\n\n

The work of HR within a company becomes more important every day. Even so, data science and business professionals may not realize how costly it can be to search for a professional who perfectly meets every requirement of a position advertised by their company. Fortunately, technology has evolved to the point of giving us strong allies for optimized searches, especially when we combine Natural Language Processing (NLP)<\/a> with analytic geometry methods such as cosine similarity.<\/strong><\/p>\n\n\n\n

In this context, NLP<\/strong> is understood as the branch of machine learning that translates human language into statistical values, which are highly useful for analyzing the frequency of repeated terms, the sentiments expressed, and the similarity to other content in large blocks of text. Naturally, this approach rests heavily on mathematical principles, such as those of analytic geometry in the case of cosine similarity<\/strong>, which measures the resemblance between two sentences converted into vectors.<\/p>\n\n\n\n

Cosine similarity can be found in any university textbook on analytic geometry and is widely used to measure the similarity between two vectors by means of the cosine of the angle between them:<\/p>\n\n\n\n
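For reference, the standard textbook definition of cosine similarity between two vectors can be written as below. The symbols A, B, n and \theta are notation chosen here for illustration and are an assumption, not something taken from the original post.

```latex
\[
\cos(\theta)
  = \frac{\mathbf{A}\cdot\mathbf{B}}{\lVert\mathbf{A}\rVert\,\lVert\mathbf{B}\rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}}\;\sqrt{\sum_{i=1}^{n} B_i^{2}}}
\]
```

For word-count vectors, whose components are never negative, the value lies between 0 (no terms in common) and 1 (vectors pointing in the same direction).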

\"\"<\/figure>\n\n\n\n

The application of these concepts is demonstrated below in a practical case<\/em> involving the HR task mentioned above.<\/p>\n\n\n\n

Case Study<\/h2>\n\n\n\n

Imagine, then, that HR could automate the search for professional profiles with the best characteristics for its openings, using fields from the LinkedIn<\/a> profile such as the description, previous positions, and skills, and comparing them with a single sentence such as “I am looking for a dedicated professional with analytical ability and retail experience who is proficient in Python, SQL, Excel, PySpark, and machine learning algorithms”.<\/em> Interesting, isn't it? To get there, we first need to model the problem, from data collection through to the calculation of the metrics.<\/p>\n\n\n\n

The data were extracted with Apify<\/strong>, a site that offers one-click web scrapers with free trial periods. In this case we used the Linkedin Companies & Profiles Bulk Scraper<\/em>, whose parameters allow subject and location filters, so “Data Science” and “Ci\u00eancia de dados” were used as subject filters and “Brasil” as the location filter. A total of 2955 profiles were extracted, a number that can be changed depending on the user's availability and preference.<\/p>\n\n\n\n

Next, the required packages were imported into the Python IDE<\/strong>. For the NLP itself, the package chosen was NLTK<\/strong>, which is widely used in Python.<\/p>\n\n\n\n

import pandas as pd\nfrom nltk.corpus import stopwords\nfrom nltk.tokenize import sent_tokenize, word_tokenize\nfrom nltk.stem import RSLPStemmer\nimport re\nfrom datetime import datetime\nfrom collections import Counter\nimport matplotlib.pyplot as plt\nfrom unidecode import unidecode\nfrom scipy.spatial import distance<\/code><\/pre><\/div>\n\n\n\n

Extracting analytical value<\/h2>\n\n\n\n

First, there are three processes to apply to a block of text in order to extract analytical value: stemming, which consists of using the roots of words (\u201ccien\u201d in place of cientista or ci\u00eancia); stop word removal, that is, removing words with no analytical value, such as connectives and articles; and tokenization, which splits the sentence into individual terms.<\/p>\n\n\n\n
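The three processes above can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not NLTK's behavior: `STOP_WORDS` and `SUFFIXES` are hypothetical, deliberately tiny lists, and `naive_stem` is a crude stand-in for a real Portuguese stemmer such as `RSLPStemmer`.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny stop word list (a real one is much longer).
STOP_WORDS = {"de", "com", "em", "e", "um", "uma", "o", "a", "que"}

# Hypothetical suffixes for a naive stemmer; real stemmers apply far richer rules.
SUFFIXES = ("tista", "cia", "s")


def strip_accents(text):
    """Remove diacritics: 'ciência' -> 'ciencia'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Lowercase, drop accents and punctuation, split on whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", strip_accents(text.lower()))
    return cleaned.split()


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem what is left."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


tokens = preprocess("Cientista de dados com experiência em ciência e análise!")
print(tokens)
```

With these toy rules, both "cientista" and "ciência" collapse to the same root, which is the property that makes stemmed tokens comparable across texts.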

The function below performs these tasks for the candidate's description:<\/p>\n\n\n\n

def tratamento_palavras(col):\n    lista_descricoes_tratadas = []\n\n    stemmer = RSLPStemmer()\n\n    # Apply the treatment to each row of the dataset\n    # (base_linkedin and sem_stop_words are assumed to be defined beforehand)\n    for index, row in base_linkedin.iterrows():\n        bloco = row[col]\n        bloco_texto = str(bloco)\n\n        # Remove special characters and tokenize the words\n        tokens = word_tokenize(re.sub(r'[^\\w\\s]', '', bloco_texto), language='portuguese')\n\n        # Remove the stop words and stem\n        palavras_stemmizadas = [stemmer.stem(unidecode(token).lower()) for token in tokens if token not in sem_stop_words]\n\n        lista_descricoes_tratadas.append(palavras_stemmizadas)\n\n    # Create a DataFrame with a column named 'Tokens' that holds the lists\n    df_stem = pd.DataFrame({'Tokens': lista_descricoes_tratadas})\n\n    return df_stem<\/code><\/pre><\/div>\n\n\n\n

The web scraper also extracts the skills that the potential candidate listed on their profile, so we also need a function to analyze them. The candidate's skills are spread across 20 columns named f”skills\/{i}”<\/em>, where i ranges from 0 to 19. The function concatenates the 20 values into a single column and applies the NLP preprocessing steps.<\/p>\n\n\n\n

def tratamento_skill():\n    bag_of_skills = []\n    stemmer = RSLPStemmer()\n\n    for index, row in base_linkedin.iterrows():\n        all_skills = []\n\n        # Iterate over the skill columns\n        for i in range(20):\n            skill = row[f"skills\/{i}"]\n\n            if not pd.isna(skill) and isinstance(skill, str):\n                skill = skill.lower()  # Everything in lowercase\n                skill = re.sub(r'[^a-zA-Z0-9\\s]', '', skill)  # Drop accented letters and special characters\n                skill_tokens = word_tokenize(skill)\n                skill_stemmed = [stemmer.stem(token) for token in skill_tokens]\n                all_skills.extend(skill_stemmed)\n\n        bag_of_skills.append(all_skills)\n\n    # Gather the information from all columns into a single list in one column\n    df_bag_of_skills = pd.DataFrame({'merged_skills': bag_of_skills})\n\n    return df_bag_of_skills<\/code><\/pre><\/div>\n\n\n\n
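To see the column-merging idea in isolation, here is a minimal sketch that works on a plain dictionary instead of a DataFrame row. The helper name `merge_skills` and the sample `profile` are hypothetical, and stemming is left out on purpose so the example needs no external packages.

```python
import re


def merge_skills(row, n_columns=20):
    """Collect the tokens of every filled 'skills/{i}' field into one list."""
    merged = []
    for i in range(n_columns):
        skill = row.get(f"skills/{i}")
        # Missing keys, None and NaN placeholders are not strings, so skip them.
        if not isinstance(skill, str):
            continue
        # Lowercase, then keep only ASCII letters, digits and whitespace.
        cleaned = re.sub(r"[^a-z0-9\s]", "", skill.lower())
        merged.extend(cleaned.split())
    return merged


# Hypothetical profile with only a few of the 20 skill fields filled in.
profile = {
    "skills/0": "Python",
    "skills/1": "Machine Learning",
    "skills/2": None,
    "skills/3": "SQL (PostgreSQL)",
}
merged = merge_skills(profile)
print(merged)
```

Skipping non-string values mirrors the `pd.isna` and `isinstance` checks in the function above, so profiles with fewer than 20 skills simply yield shorter lists.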

Finally, comparing the two generated data frames with the input sentence is split into 4 steps: creating empty vectors; filling the vectors with the word frequency information; equalizing the length of the vectors being compared, because if the input sentence vector does not have the same length as, for example, the candidate's description vector, one of them has to be padded with zeros; and lastly computing the similarity.<\/p>\n\n\n\n
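As a point of comparison, a common alternative to padding (not the approach taken in this post) is to index both vectors by a shared vocabulary, so that position k always refers to the same token and both vectors have equal length by construction. The sketch below assumes the tokens are already stemmed; the function name `count_vectors` and the sample tokens are hypothetical.

```python
from collections import Counter


def count_vectors(query_tokens, candidate_tokens):
    """Build two term-frequency vectors over the same sorted vocabulary."""
    vocabulary = sorted(set(query_tokens) | set(candidate_tokens))
    query_counts = Counter(query_tokens)
    candidate_counts = Counter(candidate_tokens)
    # Counter returns 0 for tokens it has not seen, so no padding is needed.
    query_vec = [query_counts[token] for token in vocabulary]
    candidate_vec = [candidate_counts[token] for token in vocabulary]
    return vocabulary, query_vec, candidate_vec


vocab, q, c = count_vectors(
    ["python", "sql", "varej"],
    ["python", "python", "excel", "sql"],
)
print(vocab)
print(q)
print(c)
```

Because each position is tied to one token, the dot product of the two vectors only accumulates counts of tokens that both texts share.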

Creating the empty vectors<\/h2>\n\n\n\n
def vetores_desc():\n    lista_vetor_desc = []\n\n    for index, row in bag_descricao.iterrows():\n        descricao = len(row['Tokens']) * [0]\n        lista_vetor_desc.append(descricao)\n\n    return lista_vetor_desc\n\ndef vetores_skill():\n    lista_vetor_skills = []\n\n    for index, row in bag_skills.iterrows():\n        skills = len(row['merged_skills']) * [0]\n        lista_vetor_skills.append(skills)\n\n    return lista_vetor_skills<\/code><\/pre><\/div>\n\n\n\n

To fill the vectors, the function goes through each tokenized word and counts its frequency, which is recorded in the vector at the position given by the word's index, thereby ensuring that equal tokens are compared.<\/p>\n\n\n\n

def preench_vetor():\n    # One zero vector per candidate, for descriptions and for skills\n    vetor1 = vetores_desc()\n    vetor2 = vetores_skill()\n\n    frase_consulta = lista_frase_stem\n\n    for index, row in bag_descricao.iterrows():\n        for palavra in frase_consulta:\n            if palavra in row['Tokens']:\n                # Position of the first occurrence of the query word in this description\n                index_palavra1 = row['Tokens'].index(palavra)\n                # Note: the increment is applied to every vector long enough to hold that position\n                for i in range(0, len(vetor1)):\n                    if index_palavra1 < len(vetor1[i]):\n                        vetor1[i][index_palavra1] += 1\n\n    for index, row in bag_skills.iterrows():\n        for palavra in frase_consulta:\n            if palavra in row['merged_skills']:\n                # Same procedure for the merged skills\n                index_palavra2 = row['merged_skills'].index(palavra)\n                for t in range(0, len(vetor2)):\n                    if index_palavra2 < len(vetor2[t]):\n                        vetor2[t][index_palavra2] += 1\n\n    return vetor1, vetor2<\/code><\/pre><\/div>\n\n\n\n

Finally, the last function calculates the cosine similarity after equalizing the vectors:<\/p>\n\n\n\n

def calc_sim_coss():\n    vetor1, vetor2 = preench_vetor()\n    # Query vectors for the input sentence (these two helpers are assumed to be defined elsewhere)\n    vet_b_d = vetor_busca_desc()\n    vet_b_s = vetor_busca_skill()\n\n    dist1 = []\n    dist2 = []\n\n    # Equalize the length of the vectors in vet_b_d and vetor1\n    for i in range(len(vet_b_d)):\n        max_len1 = max(len(vet_b_d[i]), len(vetor1[i]))\n\n        if len(vet_b_d[i]) < max_len1:\n            vet_b_d[i] += [0] * (max_len1 - len(vet_b_d[i]))\n\n        if len(vetor1[i]) < max_len1:\n            vetor1[i] += [0] * (max_len1 - len(vetor1[i]))\n\n    # Equalize the length of the vectors in vet_b_s and vetor2\n    for i in range(len(vet_b_s)):\n        max_len2 = max(len(vet_b_s[i]), len(vetor2[i]))\n\n        if len(vet_b_s[i]) < max_len2:\n            vet_b_s[i] += [0] * (max_len2 - len(vet_b_s[i]))\n\n        if len(vetor2[i]) < max_len2:\n            vetor2[i] += [0] * (max_len2 - len(vetor2[i]))\n\n    # Compute the cosine similarity for vet_b_d and vetor1\n    for i in range(len(vet_b_d)):\n        try:\n            dist1.append(1 - distance.cosine(vet_b_d[i], vetor1[i]))\n        except Exception:\n            pass\n\n    # Compute the cosine similarity for vet_b_s and vetor2\n    for i in range(len(vet_b_s)):\n        try:\n            dist2.append(1 - distance.cosine(vet_b_s[i], vetor2[i]))\n        except Exception:\n            pass\n\n    return dist1, dist2<\/code><\/pre><\/div>\n\n\n\n
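The padding and similarity steps can also be written without SciPy, which makes the arithmetic easier to follow. This is a minimal sketch under stated assumptions: the helper names `pad` and `cosine_similarity` are hypothetical, and returning 0.0 when one vector is all zeros is a design choice made here, not something the original code does.

```python
import math


def pad(vec, length):
    """Return a copy of vec extended with zeros up to the given length."""
    return list(vec) + [0] * (length - len(vec))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b after zero-padding the shorter one."""
    n = max(len(a), len(b))
    a, b = pad(a, n), pad(b, n)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: an all-zero vector shares nothing with the query, so score it 0.0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1, 1, 0], [1, 1])
orthogonal = cosine_similarity([1, 0], [0, 1])
empty_profile = cosine_similarity([0, 0], [1, 2, 3])
print(same_direction, orthogonal, empty_profile)
```

Always returning a number keeps one score per candidate. In the function above, any comparison that raises an exception is skipped, so the resulting lists may end up shorter than the list of candidates and no longer line up with it.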

Conclusion<\/h2>\n\n\n\n

In the end, the result is two lists: one containing the similarity of the input sentence to the vector of the candidate's profile description, and another containing the similarity of the input sentence to the skills. It is also possible to bring the profile URL<\/strong> into a final dataframe to build a list of the most qualified candidates for the opening represented by your input sentence, according to the values of the metric used.<\/p>\n\n\n\n
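One possible way to turn the two lists into a shortlist is sketched below. It is an illustration only: the function `rank_candidates`, the example URLs and scores, and the decision to average the two similarities with equal weight are all assumptions, since the post does not say how the scores should be combined.

```python
def rank_candidates(urls, desc_scores, skill_scores, top_n=3):
    """Pair each profile URL with the mean of its two scores, best first."""
    combined = [
        (url, (desc + skill) / 2)  # assumed equal weighting of both scores
        for url, desc, skill in zip(urls, desc_scores, skill_scores)
    ]
    return sorted(combined, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical profile URLs and similarity values, in the same row order.
urls = [
    "https://example.com/in/a",
    "https://example.com/in/b",
    "https://example.com/in/c",
]
ranking = rank_candidates(urls, [0.25, 0.75, 0.5], [0.25, 0.5, 0.5], top_n=2)
print(ranking)
```

This only works if the three lists share the same order and length, which is why every candidate needs a score even when a profile has no matching terms.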

Note that this approach uses an off-the-shelf web scraper and may not be ideal for your business or your HR team. If you want more tailored options, you should draw up a plan for collecting data targeted at a specific audience or location.<\/p>\n","protected":false},"excerpt":{"rendered":"

Article by Gabriel Lafet\u00e1 The work of HR within a company becomes more important every day. Even so, data science and business professionals may not realize how costly it can be to search for a professional who perfectly meets every requirement of a position advertised by their company. …<\/p>\n

Automatizando tarefas de RH com Machine Learning<\/span> Leia mais »<\/a><\/p>\n","protected":false},"author":11,"featured_media":27683,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"","site-content-layout":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","spay_email":"","footnotes":""},"categories":[445,444],"tags":[386,38,46,124,385],"class_list":["post-27681","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ferramentas-e-tecnologias","category-machine-learning","tag-automatizacao","tag-data-science","tag-estatistica-2","tag-machine-learning","tag-rh"],"yoast_head":"\nAutomatizando tarefas de RH com Machine Learning - Statplace<\/title>\n<meta name=\"description\" content=\"Entenda como o Machine Learning pode ser uma arma poderosa no aux\u00edlio em atividades realizadas pelo RH de empresas.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"pt_BR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Automatizando tarefas de RH com Machine Learning - Statplace\" \/>\n<meta property=\"og:description\" content=\"Entenda como o Machine Learning pode ser uma arma poderosa no aux\u00edlio em atividades realizadas pelo RH de empresas.\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/\" \/>\n<meta property=\"og:site_name\" content=\"Statplace\" \/>\n<meta property=\"article:published_time\" content=\"2024-05-06T17:43:36+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-10-04T16:52:48+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/statplace.com.br\/wp-content\/uploads\/2024\/05\/imagemartigo.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" \/>\n\t<meta property=\"og:image:height\" content=\"1080\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ana Luiza\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. tempo de leitura\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/statplace.com.br\/#website\",\"url\":\"https:\/\/statplace.com.br\/\",\"name\":\"Statplace\",\"description\":\"A Estat\u00edstica ao alcance de todos.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/statplace.com.br\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"pt-BR\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#primaryimage\",\"inLanguage\":\"pt-BR\",\"url\":\"https:\/\/site.statplace.com.br\/wp-content\/uploads\/2024\/05\/imagemartigo.png\",\"contentUrl\":\"https:\/\/site.statplace.com.br\/wp-content\/uploads\/2024\/05\/imagemartigo.png\",\"width\":1920,\"height\":1080},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#webpage\",\"url\":\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/\",\"name\":\"Automatizando tarefas de RH com Machine Learning - Statplace\",\"isPartOf\":{\"@id\":\"https:\/\/statplace.com.br\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#primaryimage\"},\"datePublished\":\"2024-05-06T17:43:36+00:00\",\"dateModified\":\"2024-10-04T16:52:48+00:00\",\"author\":{\"@id\":\"https:\/\/statplace.com.br\/#\/schema\/person\/ef3d5be89e042e1e85f646c16a03625c\"},\"description\":\"Entenda como o Machine Learning pode ser uma arma poderosa no aux\u00edlio em atividades realizadas pelo RH de empresas.\",\"breadcrumb\":{\"@id\":\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#breadcrumb\"},\"inLanguage\":\"pt-BR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"In\u00edcio\",\"item\":\"https:\/\/statplace.com.br\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Automatizando tarefas de RH com Machine 
Learning\"}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/statplace.com.br\/#\/schema\/person\/ef3d5be89e042e1e85f646c16a03625c\",\"name\":\"Ana Luiza\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/statplace.com.br\/#personlogo\",\"inLanguage\":\"pt-BR\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/d7a8eaeb525fdabafdd76fb8b8db4334?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/d7a8eaeb525fdabafdd76fb8b8db4334?s=96&d=mm&r=g\",\"caption\":\"Ana Luiza\"},\"url\":\"https:\/\/site.statplace.com.br\/blog\/author\/analuiza\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Automatizando tarefas de RH com Machine Learning - Statplace","description":"Entenda como o Machine Learning pode ser uma arma poderosa no aux\u00edlio em atividades realizadas pelo RH de empresas.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/","og_locale":"pt_BR","og_type":"article","og_title":"Automatizando tarefas de RH com Machine Learning - Statplace","og_description":"Entenda como o Machine Learning pode ser uma arma poderosa no aux\u00edlio em atividades realizadas pelo RH de empresas.","og_url":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/","og_site_name":"Statplace","article_published_time":"2024-05-06T17:43:36+00:00","article_modified_time":"2024-10-04T16:52:48+00:00","og_image":[{"width":1920,"height":1080,"url":"https:\/\/statplace.com.br\/wp-content\/uploads\/2024\/05\/imagemartigo.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Escrito por":"Ana Luiza","Est. 
tempo de leitura":"5 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebSite","@id":"https:\/\/statplace.com.br\/#website","url":"https:\/\/statplace.com.br\/","name":"Statplace","description":"A Estat\u00edstica ao alcance de todos.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/statplace.com.br\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"pt-BR"},{"@type":"ImageObject","@id":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#primaryimage","inLanguage":"pt-BR","url":"https:\/\/site.statplace.com.br\/wp-content\/uploads\/2024\/05\/imagemartigo.png","contentUrl":"https:\/\/site.statplace.com.br\/wp-content\/uploads\/2024\/05\/imagemartigo.png","width":1920,"height":1080},{"@type":"WebPage","@id":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#webpage","url":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/","name":"Automatizando tarefas de RH com Machine Learning - Statplace","isPartOf":{"@id":"https:\/\/statplace.com.br\/#website"},"primaryImageOfPage":{"@id":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#primaryimage"},"datePublished":"2024-05-06T17:43:36+00:00","dateModified":"2024-10-04T16:52:48+00:00","author":{"@id":"https:\/\/statplace.com.br\/#\/schema\/person\/ef3d5be89e042e1e85f646c16a03625c"},"description":"Entenda como o Machine Learning pode ser uma arma poderosa no aux\u00edlio em atividades realizadas pelo RH de 
empresas.","breadcrumb":{"@id":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#breadcrumb"},"inLanguage":"pt-BR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/statplace.com.br\/blog\/automatizando-tarefas-de-rh-com-machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"In\u00edcio","item":"https:\/\/statplace.com.br\/"},{"@type":"ListItem","position":2,"name":"Automatizando tarefas de RH com Machine Learning"}]},{"@type":"Person","@id":"https:\/\/statplace.com.br\/#\/schema\/person\/ef3d5be89e042e1e85f646c16a03625c","name":"Ana Luiza","image":{"@type":"ImageObject","@id":"https:\/\/statplace.com.br\/#personlogo","inLanguage":"pt-BR","url":"https:\/\/secure.gravatar.com\/avatar\/d7a8eaeb525fdabafdd76fb8b8db4334?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d7a8eaeb525fdabafdd76fb8b8db4334?s=96&d=mm&r=g","caption":"Ana 
Luiza"},"url":"https:\/\/site.statplace.com.br\/blog\/author\/analuiza\/"}]}},"jetpack_featured_media_url":"https:\/\/site.statplace.com.br\/wp-content\/uploads\/2024\/05\/imagemartigo.png","_links":{"self":[{"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/posts\/27681","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/comments?post=27681"}],"version-history":[{"count":1,"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/posts\/27681\/revisions"}],"predecessor-version":[{"id":27684,"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/posts\/27681\/revisions\/27684"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/media\/27683"}],"wp:attachment":[{"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/media?parent=27681"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/categories?post=27681"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/site.statplace.com.br\/wp-json\/wp\/v2\/tags?post=27681"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}