Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 4 changed files with 34 additions and 2 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
[Music] What's up guys, this is Ronnie. Welcome back to our channel, Total Technology Zone. Today's topic is very easy: I'm going to show you how to load a PDF file using LangChain. So, load PDF using PyPDF: we're going to use the Python pypdf module, which does the actual loading of the file internally. We'll use a class called PyPDFLoader, which is available in the LangChain document loaders module, to load the document, where each document contains the page content and metadata with the page number. That means we first load the file, and after that every document in the resulting set contains one page from the PDF plus some metadata. To explain this in detail, let me actually load a PDF file. Here I have a PDF called All India; it has 469 pages in total with some information in it, and I'm going to load it. First, from langchain (sometimes it doesn't give me the suggestion) dot document_loaders [Music] import [Music] PyPDFLoader, this one. Then we write loader equals PyPDFLoader. No need to remember the arguments, it will tell you: only one argument is needed, called file_path. In file_path I'll just write the file name, india.pdf. Let's first load this file and see whether I get any error. As I told you, PyPDFLoader internally uses the Python pypdf module, so if your system already has this module it will not give you any error, but if the module is not available it is going to give you an error. Oh, interesting: it is
saying that there is nothing called langchain. Okay, just hold on. Now it is complaining that pypdf is missing and suggests pip install pypdf, so I'll install that: pip3 install pypdf. I'm just continuing the installation; once it's done I'll execute this again. Okay, this time no error, which means it is loading everything perfectly well. Now we'll create docs, a collection of documents: docs equals loader.load(). After that we write for doc in docs, and simply print(doc), and then print a separator, something like this, so we can easily separate every document, that is, every page. As I told you, every document is one page from the PDF. Because it has 468 pages, it will take some time. It's executing. This is the last page, 468; you can see there is a lot of information here. Just to check whether it is giving me the right information: these look like index entries, see, 432. So yes, everything is fine, no problem. Now there is another way of loading this. I just want to comment out this line, but I don't know how to do it here; I'm trying from the keyboard but the shortcut keys are not working. Hold on, I'm checking how to do it with Control. No, still not working. Let me try again: Control-Shift, maybe this one. No,
nothing is working. So I'll keep it like this and just put a comment here, like this. Now what we'll do here is loader.load_and_split(). This loads the document and splits it into chunks; the chunks are returned as documents. Then we print docs. If I do this, let's see what happens; this is also going to take some time. Yes, this is also loading everything, but the whole thing is a list, so you can easily get the first item with index zero, like this. This gives you the first page. Yes, see, this is the first page, and the other way gives all the pages. Now, if I write for i in [Music] range of the length of docs, like this, I can simply write print docs[i].page_content, so this gives all the page content one by one. After that we can add a couple of newlines, backslash n backslash n, so two newline characters. Let's see. This is another way of extracting information from the PDF. Yes, it's extracting everything. Now I'll try to work with images. This was one way of doing it, so I'll copy and paste this here, because this requires some coding. I'll create a duplicate slide and delete this part: load PDF files with image extraction. I would like to extract images. So how do we extract the images? Let me check which page has images. I'll go here, slowly, and open the document. Basically, some
pages may have some images, right? Let me go and check; I believe I have an image around page 28. Yes, I have an image here. What is the page number? Maybe it is page 28. So I'll create a second file, maybe image_extraction.py. I'll copy up to this part, and I'll write [Music] print; page 28 is actually going to be index 27, maybe. Let's see how it works. Yes, it's definitely taking time [Music]. Okay, it is saying page 31. Let me check which page it is pointing to. What is page 31? Yes, this is page 31, but page 31 has nothing here, so this is actually page 32. Let me check. So it is showing me the content of page 31. Maybe I can try page 24; maybe page 24 will give me something. I just want to extract some images. If you see, it is not extracting the images; I believed image extraction was not part of this. Oh yes, it is extracting the images: the page content says Map 1, India Today, Jammu and Kashmir, Arunachal. So it is taking the image from page 27. Let's check page 27. This is the one, so this one is actually being extracted. Let me check here; yes, Nagaland and others, these things are coming. Right, so this is fine: it is also extracting this information. What it is doing right now is trying to extract the image as information, but if you want it done explicitly you have to pass something additional: extract_images equals True. This makes sure that it extracts all the images as text. Without that it is also doing so, but to make this
clear, setting this flag to True here means it will do it for sure. Let's see. This time I may get some errors; if it says that a module is missing, you have to copy this install command. This is the learning [Music], right? If you don't try it yourself you won't get to this point, which is why practice is important, and writing the code with your own hands is also important. Practice and mistakes will help you become an expert, and expertise will always give you additional confidence. Okay, it's done. Let me run it this way. Now, it is going to give me the same information, but this is more useful whenever you deal with sensitive data or information that requires additional consideration: the extract_images flag equals True. Now, if you see here, page 25 is blank, right? Because of that it's not coming. Let's see page 25: 27, 26, yes, it says Part I, so this is blank, which is why nothing is coming. I'll do this, and this, so after two pages. Maybe I can simply write 26 this time; this should give me some information. So you see, without the extract_images flag it will skip some pages; if it is not able to read a page it will skip it. Let's see. Now this will extract the images and try to load everything. This is now page number 28 and it's doing the right thing. Page 28 means it is index 27. I think this one, sorry, maybe this is page 28. Let's see. Hmm, I think this is not the one I'm looking for; it is showing a different thing. Let me check which page it is showing. Yes, I think this is the one: Mount Everest, I think. Yes, it is showing this part: Indus, Ravi, Beas. Let's check.
Okay, Brahmaputra here, yes. I think this is the one. So this is showing this image from page 29, actually. Yes: 500 miles, 500 kilometres, and so many things are there. So this is the extraction of the text from that image. It is not useful at this moment, but you can still get all the context out of your images if you set this flag to True. As I told you, at this point we are not building any application; we are just learning what components are available in the LangChain document loaders module and how to use the different types of loaders: CSV, PDF, Excel, unstructured, Markdown, Python, and lots of others. In this tutorial we learned how to load a PDF file, how to extract information from it, and how to extract text-based information from the images within your PDF. All these things are clear now, right? I believe you can try this on your own and develop something interesting in the future. Please give me some feedback on whether you are able to do something with this, or whether you think it is completely out of context for what you are trying to do. If most of you cannot find any context in this tutorial, just let us know in a comment and I'll try to create something based on real use cases. My objective was not to cover the foundation part and the use-case part together. In this section I'm giving you a little idea of the different types of document loaders, and in upcoming videos I'll target some real use cases where I'll use the functionality from
these different types of loaders and try to build some real applications. That is why I separated these two things, so that you won't get confused and it won't become a very complex tutorial for you. So that's it; let me know your feedback and I'll proceed with the next videos accordingly. Before I conclude, guys, please subscribe to our channel, hit the like button if you're enjoying our videos, and hit the bell notification if you want future updates from us. Also share our videos with family or friends who you think will benefit from them. If you're watching our channel or came to this playlist for the first time, subscribe and watch the playlist from the beginning, because this is the 27th tutorial and we have already posted 26 before it. If your plan is to do something really fantastic in artificial intelligence or machine learning, you should watch this series from the beginning. We started it assuming that all of you are beginners, and we started LangChain completely from ground zero. These are slightly intermediate tutorials, but if you watch the playlist from the beginning you won't have any problem later. That's it; we'll see you in the next video. Till then, take care, goodbye, and have a nice [Music] day. | ||
Summit of this kind has continued for 20 long years, going from strength to strength. This is a tribute to our Prime Minister Shri Narendrabhai Modi's vision and consistency. I have been one of the fortunate few to have participated in every single edition of Vibrant Gujarat. I have come from the city of the Gateway of India to the gateway of modern India's growth, Gujarat. I am a proud [Applause] Gujarati. When foreigners think of a new India, they think of a new Gujarat, Navu Gujarat. How did this transformation happen? Because of one leader, our beloved leader, who has emerged as the greatest global leader of our times: Shri Narendra Modi, the most successful Prime Minister in India's history. Narendrabhai, when you speak, the whole world not only listens but applauds you. My friends abroad ask me what is the meaning of the slogan that millions of Indians are chanting, 'Modi hai to mumkin hai'. I tell them [Applause]. Most respected Shri Narendra Modi ji, when you speak, the whole world not only listens but applauds you. My friends abroad ask me what is the meaning of the slogan millions of Indians are chanting, 'Modi hai to mumkin hai'. I tell them it means India's Prime Minister makes the impossible possible with his vision, determination and execution. They agree, and they also say 'Modi hai to mumkin hai'. My esteemed friends, I will never forget what my father, Dhirubhai Ambani, used to tell me in my childhood: Gujarat is your matrubhoomi, and Gujarat should always remain your karmabhoomi. Today let me declare yet again: Reliance was, is, and will always remain a Gujarati company. Each of Reliance's businesses is striving to fulfil the dreams of my 7 crore fellow Gujaratis. Reliance has invested over $150 billion, that is 12 lakh crore rupees, in creating world-class assets and capacities across India in the last 10 years. Of this, more than one-third has been invested in Gujarat alone. Honourable Chief Minister Shri Bhupendra Patel ji, today I would like to make five commitments to the people of Gujarat. First, Reliance will continue to play a leading role in
Gujarat's growth story, with significant investments in the next 10 years. Specifically, Reliance will contribute to making Gujarat a global leader in green growth. We will help Gujarat meet its target of sourcing half of its energy needs from renewable energy by the year 2030. For this we have started building the Dhirubhai Ambani Green Energy Giga Complex over 5,000 acres in Jamnagar. This will generate a large number of green jobs, enable production of green products and materials, and make Gujarat a leading exporter of green products, and we are ready to commission this in the second half of 2024 itself. Second, Reliance Jio completed the fastest rollout of 5G infrastructure anywhere in the world. Today Gujarat is fully 5G enabled, something that most of the world does not yet have. This will make Gujarat a global leader in digital data platforms and AI adoption. The 5G-enabled AI revolution will make Gujarat's economy more productive, more efficient and more globally competitive. Besides generating millions of new employment opportunities, it will produce AI-enabled doctors, AI-enabled teachers and AI-enabled farming, which will revolutionize health care, education and agricultural productivity in the state of Gujarat. This will benefit every Gujarati in urban as well as rural areas, since to my mind AI also means all-inclusive growth. Third, Reliance Retail will further accelerate its mission to bring quality products to consumers and simultaneously empower lakhs of kisans and small merchants. Our retail business improves the quality of life of all households of Gujarat with better products and services. Fourth, Reliance will make Gujarat a pioneer in new materials and the circular economy. As a first step, Reliance is setting up India's first world-class carbon fibre facility at Hazira. And finally, fifth: Prime Minister Modi ji has announced that India will bid for the 2036 Olympics. In preparation for that, Reliance and Reliance Foundation will join forces with several other partners in
Gujarat to improve education, sports and skills infrastructure that will nurture the champions of tomorrow in various Olympic sports. Most respected Prime Minister Narendra Modi ji, before I conclude, permit me to articulate a conviction that resides in the heart of everyone in this hall and of most Indians. As Chief Minister of Gujarat you used to say 'Gujarat vikas', and that is how you made Gujarat India's growth engine. Now, as Prime Minister of India, your mission is to work on the mantra of global good and make India the world's growth engine. The story of your journey from Gujarat to the global stage in just two decades is nothing short of a modern epic. Today, in today's India, is the best time for young people to enter the economy, to innovate, and to provide ease of living and ease of earning to hundreds of millions of people. The coming generations will indeed be thankful to Prime Minister Modi for being both a nationalist and an internationalist. You have laid a solid foundation for Viksit Bharat, India as a fully developed nation, in Amrit Kaal. No power on Earth can stop India from becoming a $35 trillion economy by 2047, and as I see it, Gujarat alone will become a $3 trillion economy. Therefore I am confident, every Gujarati is confident, and every Indian is confident that the Modi era will take India to new summits of prosperity, progress and glory. Thank you very much, sir. Jai Gujarat. Jai Jai Garvi Gujarat. Jai Hind. |
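
The PDF-loading steps described in the first transcript can be sketched roughly as below. This is a minimal sketch, not the presenter's actual script: it assumes a recent LangChain install where `PyPDFLoader` is imported from `langchain_community.document_loaders` (the video uses the older `langchain.document_loaders` path), that `pypdf` is installed, and that image extraction has whatever extra OCR dependency the loader asks for at runtime. The helper names `load_pdf_pages`, `format_pages` and `page_by_number` are hypothetical and introduced only for illustration.

```python
from typing import Any, List, Sequence


def load_pdf_pages(file_path: str, extract_images: bool = False) -> List[Any]:
    """Load a PDF with PyPDFLoader; load() returns one Document per page."""
    # Imported lazily so the helpers below also work without LangChain.
    # Assumption: older releases expose this as langchain.document_loaders.
    from langchain_community.document_loaders import PyPDFLoader

    # extract_images=True asks the loader to also return text found in images.
    loader = PyPDFLoader(file_path, extract_images=extract_images)
    return loader.load()


def format_pages(docs: Sequence[Any], separator: str = "-" * 40) -> str:
    """Join page contents with a separator line, as in the print loop."""
    blocks = []
    for doc in docs:
        # PyPDFLoader stores a zero-based page index under metadata["page"].
        page_index = doc.metadata.get("page")
        blocks.append(f"[page index {page_index}]\n{doc.page_content}")
    return f"\n{separator}\n".join(blocks)


def page_by_number(docs: Sequence[Any], page_number: int) -> Any:
    """Return the Document for a 1-based page number (page 28 is docs[27])."""
    if page_number < 1:
        raise ValueError("page_number is 1-based and must be at least 1")
    return docs[page_number - 1]


# Example usage (needs langchain-community, pypdf and a local india.pdf):
#   docs = load_pdf_pages("india.pdf", extract_images=True)
#   print(format_pages(docs[:2]))
#   print(page_by_number(docs, 28).page_content)
```

The 1-based lookup only lines up with PDF pages for the list returned by `load()`; `load_and_split()` returns chunks, which may not map one-to-one onto pages.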