Dimensionality

Published 2023-03-01 (last modified 2024-10-04) at https://site.statplace.com.br/blog/dimensionalidade/

Article written by Pedro Interaminense

Problems that involve Artificial Intelligence and large volumes of data often run into long processing times, excessive use of computational resources, and less accurate models. Dimensionality reduction techniques, which come from data mining, address these issues: they simplify the dataset and can improve the efficiency and accuracy of machine learning models.

When you reduce the dimension of your data, the goal is to make training faster and to help find a good solution to the problem at hand. However, the reduction discards some information, so the model may lose part of its predictive power without a strong reason for it. The first step should therefore be to train your model on the original data before considering dimensionality reduction.

In this article we cover, both conceptually and in practice, three dimensionality reduction algorithms: PCA, t-SNE, and Truncated SVD.

PCA

PCA is a numerical procedure that seeks the linear combinations of the original variables that best capture the variance of the data. In other words, PCA projects the data onto a lower-dimensional space while retaining as much information as possible. It is based on linear algebra and uses singular value decomposition (SVD) to compute the principal components.

Principal component analysis (PCA) lets you keep the most relevant components while preserving the most important parts of the overall dataset. It has an additional advantage: the new features (components) generated by PCA are uncorrelated with one another.
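As a minimal sketch of the SVD route described above, the snippet below centers a small matrix, takes its SVD with NumPy, and projects the data onto the first two principal directions. The toy data and the helper name `pca_via_svd` are assumptions made for illustration and are not part of the original article; scikit-learn's `PCA` does more work than this (solver selection, for example).

```python
import numpy as np

def pca_via_svd(X, n_components):
    """Project X onto its first n_components principal components (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)            # PCA works on centered columns
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    scores = X_centered @ Vt[:n_components].T  # coordinates in the reduced space
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, explained_variance

# Toy data (made up): 6 samples, 3 features, the first two strongly correlated
X = np.array([[2.0, 4.1, 1.0],
              [3.0, 6.2, 0.5],
              [4.0, 7.9, 1.5],
              [5.0, 10.1, 0.8],
              [6.0, 12.0, 1.2],
              [7.0, 13.8, 0.9]])

scores, explained_variance = pca_via_svd(X, n_components=2)

# The covariance matrix of the scores is diagonal: the components are uncorrelated
cov_scores = np.cov(scores, rowvar=False)
print(scores.shape)
print(np.round(cov_scores, 6))
```

The diagonal of `cov_scores` equals `explained_variance`, which is how the singular values relate to the variance captured by each component.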

t-SNE

t-SNE is a nonlinear dimensionality reduction technique that is commonly suited to high-dimensional datasets. It is a probabilistic procedure that tries to preserve the neighborhood structure of the original data in a lower-dimensional space. Instead of focusing on variance, t-SNE looks for a probability distribution that reflects the similarity between pairs of points in the original data. It uses the Student's t-distribution to model similarities in the low-dimensional space and an iterative approach to optimize the projection.

t-SNE is used most in image processing, NLP, genomic data, and speech processing. It maps multidimensional data to a lower-dimensional space and looks for patterns that can yield insight. Put more simply, it embeds points from a higher dimension into a lower one while trying to preserve each point's neighborhood.
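The probabilistic formulation described above can be written out as follows. This is the standard t-SNE formulation rather than something stated in this article; the symbols are assumptions introduced here: $x_i$ is a point in the original space, $y_i$ its low-dimensional image, $N$ the number of points, and $\sigma_i$ a per-point bandwidth determined by the perplexity setting.

```latex
\begin{align*}
p_{j|i} &= \frac{\exp\left(-\lVert x_i - x_j\rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k\rVert^2 / 2\sigma_i^2\right)},
&
p_{ij} &= \frac{p_{j|i} + p_{i|j}}{2N}, \\[4pt]
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j\rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l\rVert^2\right)^{-1}},
&
C &= \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}.
\end{align*}
```

Here $p_{ij}$ measures pairwise similarity in the original space, $q_{ij}$ uses a Student's t kernel with one degree of freedom in the reduced space, and the iterative optimization moves the $y_i$ to minimize $C$.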

Difference between PCA and t-SNE

Although both are dimensionality reduction techniques, some differences show up in practice.

1) t-SNE is computationally expensive and takes much longer to run, especially on datasets with millions of observations. PCA finishes in far less time.

2) The procedures differ: PCA is a numerical procedure, while t-SNE is a probabilistic one. PCA focuses on the variance of the data, and t-SNE focuses on the similarity between pairs of points in the original data.

3) PCA is sensitive to outliers, and t-SNE handles them better.

4) PCA tries to preserve the global structure of the data, while t-SNE tries to preserve the local structure (clusters).
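The outlier sensitivity of PCA mentioned in the list above can be illustrated with a small sketch. The data and the helper name `first_principal_direction` are made up for illustration: five points lie on the horizontal axis, and a single extreme point is then added along the vertical axis.

```python
import numpy as np

def first_principal_direction(X):
    """Return the unit vector of the first principal component of X (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return Vt[0]

# Five points lying exactly on the horizontal axis (made-up data)
clean = np.array([[-2.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
# The same points plus one extreme observation far along the vertical axis
with_outlier = np.vstack([clean, [[0.0, 100.0]]])

pc_clean = first_principal_direction(clean)
pc_outlier = first_principal_direction(with_outlier)

# The sign of a principal direction is arbitrary, so compare absolute values
print(np.abs(np.round(pc_clean, 3)))
print(np.abs(np.round(pc_outlier, 3)))
```

Because PCA maximizes variance, the single extreme point dominates the variance and rotates the first component from the horizontal axis to the vertical one.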

Truncated SVD

Truncated SVD is also a dimensionality reduction technique. It is used mostly on sparse data or data with many missing values. One example is a product recommendation system in which each user can comment on or rate a product, but many customers never do, which produces “zero” entries in the data.

Truncated SVD uses a matrix factorization similar to PCA. The difference is that PCA works on the covariance matrix, which amounts to centering the data first, while Truncated SVD factorizes the data matrix directly. Skipping the centering step keeps a sparse matrix sparse.

In Truncated SVD, the reduced data matrix has as many columns as the truncation value: only the k largest singular values and their singular vectors are kept. The truncation applies to the number of components, not to the decimal digits of the values.
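A minimal sketch of the truncation step, assuming a small made-up rating matrix and a hypothetical helper `truncated_svd`. It uses NumPy's exact SVD for clarity, whereas scikit-learn's `TruncatedSVD` defaults to a randomized solver.

```python
import numpy as np

def truncated_svd(X, k):
    """Reduce X to k columns using its k largest singular values (hypothetical helper)."""
    # No centering: the data matrix is factorized as-is
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    X_reduced = U[:, :k] * S[:k]   # equivalent to X @ Vt[:k].T
    X_approx = X_reduced @ Vt[:k]  # rank-k approximation of X
    return X_reduced, X_approx

# Made-up user x product rating matrix; 0 means "no rating"
ratings = np.array([[5.0, 0.0, 0.0, 1.0],
                    [4.0, 0.0, 0.0, 0.0],
                    [0.0, 3.0, 0.0, 0.0],
                    [0.0, 4.0, 5.0, 0.0],
                    [0.0, 0.0, 4.0, 0.0]])

X_reduced, X_approx = truncated_svd(ratings, k=2)
print(X_reduced.shape)          # 5 rows, k = 2 columns
print(np.round(X_approx, 2))
```

The reconstruction error of `X_approx` comes entirely from the discarded singular values, which is why dropping the smallest ones loses the least information.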

Hands-on

Libraries
```python
import numpy as np
import pandas as pd
```

Data collection
```python
# Load the training data and inspect the first three rows.
# pandas' own head() replaces the dfply pipe (df_train >> head(3)) used originally.
df_train = pd.read_csv('train.csv')
df_train.head(3)
```
| | Id | Home Ownership | Annual Income | Years in current job | Tax Liens | Number of Open Accounts | Years of Credit History | Maximum Open Credit | Number of Credit Problems | Months since last delinquent | Bankruptcies | Purpose | Term | Current Loan Amount | Current Credit Balance | Monthly Debt | Credit Score | Credit Default |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Own Home | 482087.0 | NaN | 0.0 | 11.0 | 26.3 | 685960.0 | 1.0 | NaN | 1.0 | debt consolidation | Short Term | 99999999.0 | 47386.0 | 7914.0 | 749.0 | 0 |
| 1 | 1 | Own Home | 1025487.0 | 10+ years | 0.0 | 15.0 | 15.3 | 1181730.0 | 0.0 | NaN | 0.0 | debt consolidation | Long Term | 264968.0 | 394972.0 | 18373.0 | 737.0 | 1 |
| 2 | 2 | Home Mortgage | 751412.0 | 8 years | 0.0 | 11.0 | 35.0 | 1182434.0 | 0.0 | NaN | 0.0 | debt consolidation | Short Term | 99999999.0 | 308389.0 | 13651.0 | 742.0 | 0 |

Credit Default is the bank's default category, where 0 = good payer and 1 = defaulter.

Encoder

To handle the nominal (categorical) variables we use the Label Encoder, which maps each category to an integer code.

```python
from sklearn import preprocessing

label_encoder = preprocessing.LabelEncoder()

# Encode each categorical column as integer codes; fit_transform refits the encoder per column.
# astype(str) turns NaN into the string 'nan', so missing values get their own code.
for col in ['Home Ownership', 'Purpose', 'Term', 'Years in current job']:
    df_train[col] = label_encoder.fit_transform(df_train[col].astype(str))
```
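To make the encoding step concrete, here is a minimal pure-Python sketch of what label encoding does. The helper `label_encode` is hypothetical; it mirrors the sorted-classes behaviour of scikit-learn's `LabelEncoder` but is not its implementation. The input values come from the sample rows of the dataset.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in sorted order (hypothetical helper)."""
    classes = sorted(set(values))
    code_of = {category: code for code, category in enumerate(classes)}
    return [code_of[v] for v in values], classes

# Values taken from the 'Home Ownership' and 'Term' columns of the three sample rows
home_codes, home_classes = label_encode(['Own Home', 'Own Home', 'Home Mortgage'])
term_codes, term_classes = label_encode(['Short Term', 'Long Term', 'Short Term'])

print(home_classes, home_codes)
print(term_classes, term_codes)
```

A column with n categories receives the codes 0 to n-1. The codes are only 0 and 1 when a column has exactly two categories, as in these samples.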

Filling NaN values
```python
# Replace the remaining missing values with a sentinel value
df_train.fillna(-99999, inplace=True)
```

Separating the X and y variables
```python
# Features (X) are every column except the target; y is the target column
X = df_train.drop('Credit Default', axis=1)
y = df_train['Credit Default']
```
```python
from sklearn.model_selection import train_test_split

# test_size=0.7 holds out 70% of the rows for testing
x_treino, x_teste, y_treino, y_teste = train_test_split(X, y, test_size=0.7, random_state=0)
```
```python
from sklearn.preprocessing import StandardScaler

# Standardize every feature to zero mean and unit variance
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```
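The scaling step applies a z-score to each column: subtract the column mean and divide by the population standard deviation. A minimal sketch for a single column follows. The helper `standardize` is hypothetical, and the values are the three `Credit Score` entries from the sample rows.

```python
from statistics import fmean, pstdev

def standardize(column):
    """Return z-scores: (x - mean) / population standard deviation (hypothetical helper)."""
    mean = fmean(column)
    std = pstdev(column)
    return [(x - mean) / std for x in column]

# 'Credit Score' values from the three sample rows of the dataset
credit_score = [749.0, 737.0, 742.0]
z = standardize(credit_score)

print([round(v, 3) for v in z])
```

After this transformation the column has mean 0 and standard deviation 1, so features measured on very different scales (income versus years of credit history, for example) contribute comparably to PCA and t-SNE.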
```python
import time
from sklearn.manifold import TSNE
from sklearn.decomposition import PCA, TruncatedSVD

# t-SNE
t0 = time.time()
X_reduced_tsne = TSNE(n_components=2, random_state=42).fit_transform(X_scaled)
t1 = time.time()
# '{:.2}' prints two significant digits, so long runs appear in scientific notation
print("T-SNE took {:.2} s".format(t1 - t0))

# PCA
t0 = time.time()
X_reduced_pca = PCA(n_components=2, random_state=42).fit_transform(X_scaled)
t1 = time.time()
print("PCA took {:.2} s".format(t1 - t0))

# Truncated SVD
t0 = time.time()
X_reduced_svd = TruncatedSVD(n_components=2, algorithm='randomized', random_state=42).fit_transform(X_scaled)
t1 = time.time()
print("Truncated SVD took {:.2} s".format(t1 - t0))
```

T-SNE took 1.1e+02 s
PCA took 2.2 s
Truncated SVD took 0.047 s
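The t-SNE time is printed as `1.1e+02 s` (about 110 seconds) because the format spec `{:.2}` asks for two significant digits rather than two decimal places. A small sketch of the difference, using a hypothetical elapsed time chosen only to match that order of magnitude:

```python
elapsed = 110.4  # hypothetical run time in seconds, not a measured value

two_significant = "{:.2} s".format(elapsed)  # general format, 2 significant digits
two_decimals = "{:.2f} s".format(elapsed)    # fixed-point format, 2 decimal places

print(two_significant)
print(two_decimals)
```

Using `{:.2f}` in the timing code would print all three durations in plain seconds.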

Scatter plot of the reduced dimensions
```python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

f, axes = plt.subplots(1, 3, figsize=(24, 6))
f.suptitle('Clusters after dimensionality reduction', fontsize=14)

# Legend entries: blue for good payers, red for defaulters
blue_patch = mpatches.Patch(color='#0A0AFF', label='Not in default')
red_patch = mpatches.Patch(color='#AF0000', label='In default')

# One scatter plot per technique, colored by class:
# with 'coolwarm', y == 0 maps to blue and y == 1 maps to red
projections = [('t-SNE', X_reduced_tsne), ('PCA', X_reduced_pca), ('Truncated SVD', X_reduced_svd)]
for ax, (title, X_reduced) in zip(axes, projections):
    ax.scatter(X_reduced[:, 0], X_reduced[:, 1], c=y, cmap='coolwarm', linewidths=2)
    ax.set_title(title, fontsize=14)
    ax.grid(True)
    ax.legend(handles=[blue_patch, red_patch])

plt.show()
```

[Figure: scatter plots of the three 2-D projections (t-SNE, PCA, Truncated SVD)]

Reducing the number of dimensions yields less sparse data, and discarding information does not always reduce predictive ability, which is why dimensionality reduction matters.
