Showing 17 changed files with 36 additions and 18 deletions.
File renamed without changes
@@ -1,14 +1,22 @@

# A Box detection algorithm for tabularized data.

When you are working with optical character recognition (OCR) or any other data or
object recognition problem, the first step is preprocessing. Here, preprocessing means
extracting the location where the information is located. After the location is extracted,
any machine learning algorithm can be run on that image.
- This code extracts data that is in tabular format using image processing techniques.
- When you are working with optical character recognition (OCR) or any other data or object recognition problem, the first step is preprocessing.
- Here, preprocessing means extracting the boxes where the data is located. After the boxes are extracted, any OCR algorithm can be run on those crops for recognition.
![](Images/Image.jpg)

The problem arises when you have to detect objects that are located in tables/boxes or
in row-column format. If the image looks like this, you have to detect the boxes and extract them one by one.
- The problem arises when you have to detect objects that are located in tables/boxes or in row-column format. If the image looks like this, you have to detect the boxes and extract them one by one.
This must be done accurately for all images.
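
Extracting boxes "one by one" usually implies some ordering. The sketch below is not taken from `src/box_detection.py`; it is a minimal illustration, assuming each detected box is an `(x, y, w, h)` tuple, of how boxes could be put into reading order (top to bottom, then left to right). The function name `sort_reading_order` and the `row_tolerance` parameter are hypothetical.

```python
def sort_reading_order(boxes, row_tolerance=10):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right.

    Boxes whose top edge lies within `row_tolerance` pixels of the first
    box in the current row are treated as belonging to the same row.
    Both the tuple layout and the tolerance are assumptions for this sketch.
    """
    rows = []
    # Walk the boxes from top to bottom and group them into rows.
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Inside each row, order the boxes from left to right.
    return [box for row in rows for box in sorted(row, key=lambda b: b[0])]


if __name__ == "__main__":
    detected = [
        (120, 12, 100, 40),
        (10, 10, 100, 40),
        (10, 60, 100, 40),
        (120, 58, 100, 40),
    ]
    for box in sort_reading_order(detected):
        print(box)
```

A small tolerance helps when the top edges of boxes in the same table row differ by a few pixels because of scan skew; the right value depends on the image resolution.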
This algorithm helps detect every box accurately and saves each one in a "Cropped" folder. The code is in box_detection.py,
and the test image is "41.jpg".

You can see the Medium blog post for this code: https://medium.com/@kananvyas/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26
This algorithm helps detect every box accurately and saves each one in the `/Output/` folder. The code is in `src/box_detection.py`.
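
To make the idea of "detect every box, then crop it" concrete, here is a toy, dependency-free sketch. It is not the code in `src/box_detection.py` and does not use any image library: it assumes the table has already been binarized into a grid of characters where `#` marks a table-line pixel, and it finds each enclosed cell with a flood fill. The names `find_boxes`, `crop`, and `TABLE` are hypothetical.

```python
from collections import deque

LINE = "#"  # assumed marker for a table-line pixel in this toy grid


def find_boxes(grid):
    """Return (top, left, bottom, right) bounds of every enclosed cell.

    A cell is a 4-connected region of non-line pixels. Regions that touch
    the grid border are skipped because no table line encloses them.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == LINE or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            touches_border = False
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                if y in (0, rows - 1) or x in (0, cols - 1):
                    touches_border = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] != LINE):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not touches_border:
                boxes.append((top, left, bottom, right))
    return boxes


def crop(grid, box):
    """Cut the rectangle described by `box` out of the grid."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in grid[top:bottom + 1]]


TABLE = [
    "#########",
    "#.A.#.B.#",
    "#...#...#",
    "#########",
    "#.C.#.D.#",
    "#########",
]

if __name__ == "__main__":
    for box in find_boxes(TABLE):
        print(box, crop(TABLE, box))
```

On a real scan, the binarization and line detection would come from an image processing library, and each crop would be written to the output folder instead of printed.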
**USAGE:**
- Run `python src/box_detection.py`
- You can see the output crops in the `/Output` folder
![](Output/1.png)
![](Output/2.png)

You can also read the Medium article to understand the algorithm: https://medium.com/@kananvyas/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26