Week 11, pandas #16
base: main
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,43 @@ | ||||||
import pandas as pd | ||||||
|
||||||
#['Track', 'Album Name', 'Artist', 'Release Date', 'ISRC','All Time Rank', 'Track Score', 'Spotify Streams','Spotify Playlist Count', 'Spotify Playlist Reach','Spotify Popularity', 'YouTube Views', 'YouTube Likes', 'TikTok Posts','TikTok Likes', 'TikTok Views', 'YouTube Playlist Reach','Apple Music Playlist Count', 'AirPlay Spins', 'SiriusXM Spins','Deezer Playlist Count', 'Deezer Playlist Reach','Amazon Playlist Count', 'Pandora Streams', 'Pandora Track Stations','Soundcloud Streams', 'Shazam Counts', 'TIDAL Popularity','Explicit Track'] | ||||||
|
||||||
df_musicas = pd.read_csv('../../material/mais_ouvidas_2024.csv') | ||||||
|
||||||
print(df_musicas.head()) # shows the "head" (first rows) of the DataFrame | ||||||
print(df_musicas.columns) # shows all column names | ||||||
|
||||||
# 2 - Identify the columns that contain numbers, such as 'Spotify Streams', 'YouTube Views', etc., and convert them to a numeric type if they are stored in another format. (Use replace() and astype()) | ||||||
|
||||||
colunas = ['Track', 'Album Name', 'Artist', 'Release Date', 'ISRC','All Time Rank', 'Track Score', 'Spotify Streams','Spotify Playlist Count', 'Spotify Playlist Reach','Spotify Popularity', 'YouTube Views', 'YouTube Likes', 'TikTok Posts','TikTok Likes', 'TikTok Views', 'YouTube Playlist Reach','Apple Music Playlist Count', 'AirPlay Spins', 'SiriusXM Spins','Deezer Playlist Count', 'Deezer Playlist Reach','Amazon Playlist Count', 'Pandora Streams', 'Pandora Track Stations','Soundcloud Streams', 'Shazam Counts', 'TIDAL Popularity','Explicit Track'] | ||||||
nulos = df_musicas.isnull() # boolean mask marking the null values | ||||||
print(nulos.sum()) # counts the null values in each column | ||||||
print(df_musicas.dtypes) | ||||||
|
||||||
for col in colunas: | ||||||
    if df_musicas[col].dtypes == 'object': | ||||||
        df_musicas[col] = df_musicas[col].str.replace(',', '').astype(float, errors='ignore') | ||||||
Comment on lines +17 to +19
Be careful here: you only needed `errors='ignore'` because you are trying to convert columns of type object that really are object (text) columns.
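One way to act on the review comment above is to convert only the columns that are expected to hold numbers, rather than looping over every column. The snippet below is a minimal sketch under that assumption: the three-row DataFrame and the `numeric_cols` list are hypothetical stand-ins for the real CSV, and `pd.to_numeric(..., errors="coerce")` is swapped in for `astype(float, errors='ignore')` so that unparseable values become `NaN` instead of silently staying as text.

```python
import pandas as pd

# Hypothetical sample standing in for mais_ouvidas_2024.csv
df = pd.DataFrame({
    "Track": ["Song A", "Song B", "Song C"],
    "Spotify Streams": ["1,234,567", "890,123", None],
    "YouTube Views": ["2,000", "3,500,000", "12"],
})

# Assumption: we list by hand the columns that should be numeric
numeric_cols = ["Spotify Streams", "YouTube Views"]

for col in numeric_cols:
    # Remove the thousands separators as a literal string, not a regex
    cleaned = df[col].str.replace(",", "", regex=False)
    # Unparseable or missing values become NaN instead of raising
    df[col] = pd.to_numeric(cleaned, errors="coerce")

print(df.dtypes)
```

Text columns such as 'Track' are never touched, so no error suppression is needed.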
||||||
|
||||||
# 3 - Convert the 'Release Date' column to datetime format. | ||||||
|
||||||
df_musicas['Release Date'] = pd.to_datetime(df_musicas['Release Date'], format='mixed') | ||||||
print(df_musicas.dtypes) | ||||||
|
||||||
# 4 - Create a new column called 'Streaming Popularity' that is the mean of the popularity on the platforms 'Spotify Popularity', 'YouTube Views', 'TikTok Likes', and 'Shazam Counts'. (Remember that means and other math operations only work on numeric types.) | ||||||
|
||||||
df_musicas['Streaming Popularity'] = df_musicas[['Spotify Popularity', 'YouTube Views', 'TikTok Likes', 'Shazam Counts']].median(axis=1) | ||||||
`median()` computes the median; what you need here is the mean 😄
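To illustrate the difference the reviewer points out, here is a small sketch that computes both statistics row by row. The two-row DataFrame is made up for the example; only the column names come from the exercise.

```python
import pandas as pd

# Hypothetical values; the column names match the exercise
df = pd.DataFrame({
    "Spotify Popularity": [90.0, 70.0],
    "YouTube Views": [1000.0, 200.0],
    "TikTok Likes": [10.0, 30.0],
    "Shazam Counts": [100.0, 100.0],
})

platforms = ["Spotify Popularity", "YouTube Views", "TikTok Likes", "Shazam Counts"]

# Row-wise mean, which is what the exercise asks for
df["Streaming Popularity"] = df[platforms].mean(axis=1)

# Row-wise median, shown only for comparison
row_median = df[platforms].median(axis=1)

print(df["Streaming Popularity"].tolist())  # [300.0, 100.0]
print(row_median.tolist())                  # [95.0, 85.0]
```

A single large value (1000 views in the first row) pulls the mean up far more than the median, which is why the two calls are not interchangeable.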
||||||
print(df_musicas['Streaming Popularity']) | ||||||
|
||||||
# 5 - Create a 'Total Streams' column by summing the values of 'Spotify Streams', 'YouTube Views', 'TikTok Views', 'Pandora Streams', and 'Soundcloud Streams'. | ||||||
|
||||||
df_musicas['Total Streams'] = df_musicas[['Spotify Streams', 'YouTube Views', 'TikTok Views', 'Pandora Streams', 'Soundcloud Streams']].sum(axis=1) | ||||||
print(df_musicas['Total Streams']) | ||||||
|
||||||
# 6 - Keep only the tracks where the Spotify popularity ('Spotify Popularity') is greater than 80 and that have more than 1 million total streams ('Total Streams'). | ||||||
|
||||||
filtrar = df_musicas[(df_musicas['Spotify Popularity'] > 80) & (df_musicas['Total Streams'] > 1_000_000)] | ||||||
print(filtrar.head()) | ||||||
|
||||||
# 7 - Save the resulting DataFrame to a new JSON file called 'faixas_filtradas.json'. - Make sure the file was saved correctly | ||||||
|
||||||
filtrar.to_json('./faixas_filtradas.json', orient='records') # index=False raises ValueError with the default orient; 'records' omits the index |
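Step 7 also asks to make sure the file was saved correctly, which the script does not check. A minimal sketch of one way to do that is to read the file back and compare it with the DataFrame in memory. The sample data, the temporary output directory, and the choice of `orient="records"` are assumptions made for this illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical filtered result standing in for `filtrar`
filtered = pd.DataFrame({
    "Track": ["Song A", "Song B"],
    "Spotify Popularity": [91, 85],
    "Total Streams": [5_000_000.0, 2_500_000.0],
})

# Written to a temp directory here so the sketch does not touch the repository
out_path = os.path.join(tempfile.gettempdir(), "faixas_filtradas.json")
filtered.to_json(out_path, orient="records")

# Check 1: the file exists on disk
file_exists = os.path.exists(out_path)

# Check 2: reading it back gives the same shape and column names
reloaded = pd.read_json(out_path, orient="records")
same_shape = reloaded.shape == filtered.shape
same_columns = list(reloaded.columns) == list(filtered.columns)

print(file_exists, same_shape, same_columns)
```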
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
import pandas as pd | ||
|
||
# ['TransactionID', 'Date', 'MobileModel', 'Brand', 'Price', 'UnitsSold','TotalRevenue', 'CustomerAge', 'CustomerGender', 'Location','PaymentMethod'] | ||
|
||
df = pd.read_csv("../../material/mobile_sales.csv") | ||
|
||
print(df.head()) | ||
print(df.columns) | ||
df_valores_nulos = df.isnull() | ||
print(df_valores_nulos.sum()) | ||
print(df.duplicated().sum()) | ||
print(df.dtypes) | ||
|
||
df["Date"] = pd.to_datetime(df["Date"], format='mixed') | ||
print(df.dtypes) | ||
|
||
print(df["Date"]) # shows the data in the selected column | ||
print('Date') | ||
|
||
df["Total Sales Value"] = df["Price"] * df["UnitsSold"] # Creates a new column titled Total Sales Value as the product of Price x UnitsSold | ||
|
||
print(df["Total Sales Value"]) # prints the new column | ||
|
||
print(df.columns) | ||
|
||
profit_per_product = 0.30 | ||
|
||
df['Profit Margin'] = (df['Price']*profit_per_product)* df['UnitsSold'] | ||
print(df['Profit Margin']) | ||
|
||
filtered_df = df[(df["Total Sales Value"] > 100_000) & (df["Profit Margin"] > 20_000)] | ||
|
||
print(filtered_df.head()) | ||
|
||
filtered_df.to_csv("./filtered_list.csv", index=False) |
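As a worked check of the two derived columns and the filter, the sketch below reruns the same arithmetic on a tiny sample. The first row reuses the Price and UnitsSold of the first record in the committed `filtered_list.csv`; the second row and the names `sales` and `PROFIT_RATE` are hypothetical.

```python
import pandas as pd

# Row 1 mirrors the first record of filtered_list.csv; row 2 is made up
sales = pd.DataFrame({
    "MobileModel": ["direction", "sample"],
    "Price": [1196.95, 100.00],
    "UnitsSold": [85, 10],
})

PROFIT_RATE = 0.30  # same 30% rate as profit_per_product in the script

# Revenue per transaction: unit price times units sold
sales["Total Sales Value"] = sales["Price"] * sales["UnitsSold"]

# Profit per transaction: 30% of the unit price, times units sold
sales["Profit Margin"] = (sales["Price"] * PROFIT_RATE) * sales["UnitsSold"]

# Both conditions must hold; each side needs its own parentheses
mask = (sales["Total Sales Value"] > 100_000) & (sales["Profit Margin"] > 20_000)
filtered = sales[mask]

print(filtered)
```

Because the profit is a flat 30% of the sales value, any row above 100,000 in sales already has a profit above 30,000, so with these thresholds the second condition never excludes a row on its own.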
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
TransactionID,Date,MobileModel,Brand,Price,UnitsSold,TotalRevenue,CustomerAge,CustomerGender,Location,PaymentMethod,Total Sales Value,Profit Margin | ||
79397f68-61ed-4ea8-bcb2-f918d4e6c05b,2024-01-06,direction,Green Inc,1196.95,85,28002.8,32,Female,Port Erik,Online,101740.75,30522.225 | ||
f7e98db9-cb87-453e-8179-e48ba5443932,2024-03-07,idea,"Massey, Nicholson and Young",1498.13,70,9703.89,45,Female,Port Daryl,Debit Card,104869.1,31460.730000000003 | ||
e59a8eb1-8448-4719-8502-2c97407d0ff9,2024-01-08,free,Nelson and Sons,1333.31,79,49676.78,45,Female,East Brianstad,Online,105331.48999999999,31599.447 | ||
b5119fd6-e0d7-44ee-8f87-44f91d42de3f,2024-05-23,law,Roach-Strong,1236.37,89,26408.0,43,Female,Victorview,Credit Card,110036.93,33011.079 | ||
03d675d2-f4c7-4860-b159-f7df7142b87e,2024-07-07,special,Weaver Ltd,1418.24,99,1877.76,42,Other,Port Ericstad,Credit Card,140405.76,42121.727999999996 | ||
9b4f4a39-8512-411a-8533-2b1d99cf4e64,2024-06-24,travel,"James, Garcia and Brown",1141.24,94,98242.2,28,Female,Bellview,Cash,107276.56,32182.968 | ||
8d757a3c-6ffc-4b44-97aa-de01f6bc3b56,2024-06-06,bar,Jordan-Williams,1409.49,81,29360.24,40,Other,Alberttown,Credit Card,114168.69,34250.606999999996 | ||
8201d6e3-4f18-4911-9b28-c50627fd1640,2024-02-11,test,Walker-White,1444.64,89,12646.8,51,Female,East Kenneth,Cash,128572.96,38571.888 | ||
8499624a-1c49-4645-9c65-df13c57c87d9,2024-07-01,matter,Andrews LLC,1269.71,96,72588.74,41,Female,Powellmouth,Credit Card,121892.16,36567.648 | ||
7f36c3a2-ff43-483b-adb8-668e37d16534,2024-07-15,partner,Hebert Inc,1352.48,95,20333.88,32,Other,Crawfordville,Online,128485.6,38545.68 | ||
56ad37f3-bb16-4f08-90ba-140b27eebd4a,2024-07-01,century,"Bates, Pearson and Hardy",1245.2,81,59695.92,34,Female,Thomasfort,Cash,100861.2,30258.36 | ||
5a5ebad6-dab7-4388-9961-411136b68e27,2024-01-24,eight,Martin-Carson,1256.1,90,25697.72,26,Other,South Benjamin,Cash,113048.99999999999,33914.7 | ||
49ee7bcb-01a3-4c71-9273-49d4afd465a4,2024-03-01,son,Anderson-White,1285.81,82,27464.7,44,Female,Port Williamshire,Online,105436.42,31630.926 | ||
7ebd3c9c-21a9-48d4-802b-50b2ae3d74e6,2024-07-25,play,Cabrera-White,1358.1,83,30654.0,40,Other,New Christina,Credit Card,112722.29999999999,33816.689999999995 | ||
96845a93-75b3-4a60-b213-201181842f96,2024-04-16,skill,"White, Ford and Andrews",1360.71,84,65262.78,29,Female,Snowfurt,Online,114299.64,34289.892 | ||
5633dd9e-0ad3-455e-8b00-7236c2379e1b,2024-07-25,security,"Ware, May and Lopez",1485.6,68,51970.4,47,Female,Michaelland,Credit Card,101020.79999999999,30306.239999999998 | ||
fdf59c83-bd14-459c-a73d-1c17b86a5a94,2024-01-13,effect,"Martin, Smith and Patterson",1317.86,84,8388.09,54,Female,Davisbury,Credit Card,110700.23999999999,33210.07199999999 | ||
07602482-1535-4857-b739-94bbfbbb2ef8,2024-04-05,property,Torres Inc,1285.42,91,16868.06,58,Other,Stacyborough,Credit Card,116973.22,35091.966 | ||
74df89fb-b693-457d-a68e-e017937d9fe0,2024-03-20,middle,"Cooper, Mcclain and Cook",1276.97,85,119864.64,50,Female,West Alice,Debit Card,108542.45,32562.735 | ||
3b05d54f-0bdf-444c-a021-19e842585d6b,2024-05-28,involve,Figueroa LLC,1184.54,89,23249.54,32,Female,South Melissa,Online,105424.06,31627.217999999997 | ||
b09fb488-824a-416d-8800-bbd06fd3530e,2024-02-07,nothing,"Miller, Hill and Lawson",1278.82,91,61818.67,49,Female,Thomasview,Credit Card,116372.62,34911.78599999999 | ||
da4115cc-24fb-4ca6-835c-6eeaf8d03cf2,2024-05-27,mention,"Skinner, Ramirez and Kelley",1486.29,68,39261.45,63,Male,Herreraborough,Cash,101067.72,30320.316 | ||
97b019b5-ee7f-47bb-a9fe-8a23c0ca4b55,2024-05-11,woman,Williamson-Clay,1378.49,80,10696.96,22,Female,Christopherbury,Debit Card,110279.2,33083.759999999995 | ||
0cc43991-586c-4734-94cc-ff0163883c1f,2024-04-15,situation,Stein-Bridges,1426.72,84,94828.04,41,Male,Espinozamouth,Online,119844.48,35953.344000000005 | ||
821fa086-7197-4c69-9733-1dcc939af665,2024-04-06,sing,Cobb LLC,1178.88,89,33021.82,53,Male,Jimchester,Online,104920.32,31476.096000000005 | ||
c07fff34-92b2-42e0-982b-6dd160b071e1,2024-04-02,operation,"Duncan, Mendoza and Mcdowell",1477.14,90,66427.94,51,Other,South Brandon,Debit Card,132942.6,39882.78 | ||
57a14899-fdc3-4bc4-82fb-dee041cc5085,2024-03-29,build,Andrews-Martin,1134.19,95,55740.27,21,Male,East Brian,Credit Card,107748.05,32324.415 | ||
1bcdf294-8f88-4cdd-aa1b-5296be5466b4,2024-03-30,expect,"Jackson, White and Brown",1409.21,76,8286.95,43,Other,Jordanfurt,Credit Card,107099.96,32129.987999999998 | ||
918c1785-2c85-4757-a15e-cfddb38dd38e,2024-05-27,bad,"Mcmahon, Jones and Baker",1317.59,86,102433.02,28,Other,New Charles,Credit Card,113312.73999999999,33993.822 | ||
0835e90c-955f-47dc-87df-98c314e501a7,2024-06-25,former,Weaver-Thompson,1196.33,87,115585.69,35,Other,Brandonton,Online,104080.70999999999,31224.212999999996 | ||
d0da1b38-58fd-4b3f-85fa-d9a8f45ee603,2024-01-20,practice,Wilcox PLC,1293.55,95,85958.04,44,Male,Mclaughlinburgh,Cash,122887.25,36866.175 | ||
83f22533-6d79-48b7-9d4c-e78aadf0595a,2024-01-23,protect,"Burns, Davila and Camacho",1297.7,85,68618.72,36,Female,North Johnport,Debit Card,110304.5,33091.35 | ||
1c06d99c-d59c-47e8-83ba-38a40a00be41,2024-04-01,artist,Smith-Tucker,1294.76,99,9072.42,42,Male,Greeneview,Debit Card,128181.24,38454.372 | ||
56f4a3f9-ee2d-4c28-b836-8f183b0333b0,2024-04-05,they,"Kirby, Oneill and Carter",1345.42,94,31312.96,25,Female,Steelemouth,Online,126469.48000000001,37940.844000000005 | ||
784b0c63-1eb4-42bf-a8e1-de0d1f9bbbb2,2024-05-18,blood,Fleming Group,1465.14,80,11440.5,60,Male,Newmantown,Cash,117211.20000000001,35163.36 | ||
1cafe067-7e81-46ff-9990-579adaebc2cf,2024-07-08,senior,Jensen-Lowe,1483.91,94,12619.62,29,Female,West Susan,Online,139487.54,41846.262 | ||
077487a5-61c4-4f29-bd88-49901d7b47e7,2024-04-30,painting,Harris-Bell,1385.88,83,9700.6,41,Other,North Samuel,Online,115028.04000000001,34508.412000000004 | ||
4fe52be3-0c3e-4098-bc58-8f391cc3fb26,2024-04-29,born,Cunningham-Hawkins,1390.89,79,60684.54,28,Female,Port Fernandomouth,Credit Card,109880.31000000001,32964.093 | ||
0740c846-3424-4692-afee-088248bbfd37,2024-07-22,figure,Flowers-Erickson,1408.39,97,44118.9,30,Other,South Holly,Online,136613.83000000002,40984.149 | ||
c1dd718f-8c25-47a9-bc37-5dd6c767199e,2024-02-26,rule,"Vasquez, Roberts and Johnson",1458.6,85,6503.25,19,Other,Bowentown,Credit Card,123980.99999999999,37194.299999999996 | ||
b563bdbe-055d-41a8-8cea-06be0db9c82c,2024-05-02,possible,"Johnson, Mcconnell and May",1385.71,75,24011.36,57,Male,Sullivanmouth,Credit Card,103928.25,31178.475000000002 | ||
15b09d84-166a-4a3d-a92b-d6cddc7e46cf,2024-05-15,experience,Young Inc,1341.15,93,99642.96,55,Male,Lawsonbury,Credit Card,124726.95000000001,37418.085 | ||
6c234dd7-845d-49d8-a506-0d0525939c52,2024-05-14,most,"Weaver, Young and King",1235.36,98,22969.48,25,Female,Micheleshire,Online,121065.27999999998,36319.583999999995 | ||
173d6e2c-d2d4-4a78-9b56-8ff05194c0be,2024-07-20,thus,Anderson-Burns,1253.69,94,65396.79,23,Male,Brownburgh,Online,117846.86,35354.058000000005 | ||
26760d0b-ece5-48d9-906f-6511c119a434,2024-04-08,fine,Sampson-Kennedy,1179.65,89,50670.65,39,Other,South Christina,Credit Card,104988.85,31496.655000000002 | ||
9230af26-83a1-4066-8d7a-8f32cb65a58c,2024-06-12,industry,"Barrett, Figueroa and White",1384.31,86,64326.15,45,Other,North Jeffrey,Debit Card,119050.65999999999,35715.198 | ||
95bee8ce-4701-4a6b-8a17-a68f2e661677,2024-03-10,director,Dennis-Sanchez,1343.65,96,3301.35,19,Female,Lake Christopher,Online,128990.40000000001,38697.12 | ||
05dc416c-5c87-4726-88bb-c3f02350f9d4,2024-06-18,easy,Jones-Nguyen,1354.53,88,16047.56,63,Other,West Kayla,Credit Card,119198.64,35759.592 | ||
06cbf9e5-391e-4727-a5b0-24c93b3f88df,2024-07-03,option,"Hanson, Barron and Castillo",1110.66,93,65983.14,25,Female,Dunnland,Debit Card,103291.38,30987.414000000004 | ||
385776a8-dd02-47b3-ac81-848385c53e01,2024-01-25,face,"Hester, Lee and Kirby",1309.52,80,52684.5,55,Female,Kellyton,Cash,104761.6,31428.48 | ||
2ad868ea-e6ec-4c08-90df-28627a36cd19,2024-05-27,science,"Daniels, Rojas and Pearson",1137.5,96,14628.86,29,Male,Sheilaburgh,Online,109200.0,32760.0 | ||
21cdcebc-fa3e-413a-9702-8fbd7b1d8682,2024-05-20,plant,Thomas Ltd,1217.74,96,65193.3,45,Other,Mooreburgh,Cash,116903.04000000001,35070.912 | ||
976cc526-100f-41ea-a8fa-6beb56f959f9,2024-02-13,decision,Miller-Jordan,1251.55,95,31085.22,55,Female,Michaelhaven,Debit Card,118897.25,35669.174999999996 | ||
c4c114b1-252e-4e40-9c57-c08f2d7388bc,2024-02-01,particularly,"Myers, Wilcox and Beck",1466.37,86,28288.0,22,Male,Danielbury,Online,126107.81999999999,37832.346 | ||
c390e049-3c4f-4b58-ad23-f52c64d7768f,2024-01-18,resource,"Fox, Stevens and Bell",1114.73,91,7388.25,64,Male,East Robertahaven,Debit Card,101440.43000000001,30432.128999999997 | ||
903f5961-35c7-47b9-a1f1-5492aa15b049,2024-04-06,four,Robinson-Thompson,1223.95,95,8217.44,43,Female,East Adam,Credit Card,116275.25,34882.575 | ||
cf0ec4eb-6751-4904-9980-9cbacd679c14,2024-03-15,hope,Hamilton-Garcia,1348.6,79,44273.85,55,Female,Tannerfort,Debit Card,106539.4,31961.82 | ||
f503c272-a176-4704-9011-e47781852269,2024-03-19,huge,Allen-Mays,1349.26,83,98113.14,52,Male,Danielport,Online,111988.58,33596.574 | ||
246f33f9-10a0-4d0d-82f5-e7c2164caf37,2024-07-14,around,"Carroll, Brown and Bates",1486.13,75,2336.76,20,Female,Amybury,Online,111459.75000000001,33437.925 | ||
b4df370f-821b-43aa-a876-841b99222c0b,2024-07-01,discussion,"Santiago, Yoder and Stevens",1447.46,73,73070.87,59,Male,Andreaview,Online,105664.58,31699.374 | ||
fcf20873-f45d-4ae1-ba0a-6333c35a01f6,2024-01-23,watch,Morrison-Stanley,1424.36,79,35283.6,63,Other,Gibbston,Credit Card,112524.43999999999,33757.331999999995 | ||
41f08915-addb-4966-8628-038c479c619a,2024-01-28,challenge,Brooks Ltd,1386.69,76,28865.7,39,Male,Ronaldchester,Credit Card,105388.44,31616.532 |
If these are all the columns, why not use `df.columns`?
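A brief sketch of what that could look like: `columns` is an attribute of the DataFrame (not a method), so the list of names can be read from the data instead of being typed by hand. The small DataFrame is a hypothetical stand-in, and `select_dtypes` is shown as one optional way to narrow the list to numeric columns.

```python
import pandas as pd

# Hypothetical stand-in for the music dataset
df = pd.DataFrame({
    "Track": ["Song A"],
    "Spotify Streams": ["1,000"],
    "Track Score": [725.4],
})

# All column names, taken directly from the DataFrame
all_columns = df.columns.tolist()
print(all_columns)  # ['Track', 'Spotify Streams', 'Track Score']

# Only the columns pandas already stores as numbers
numeric_columns = df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)  # ['Track Score']
```

Note that 'Spotify Streams' is not in the numeric list here because it is still stored as text with thousands separators.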