Out of Memory issue #4

Open
Drow999 opened this issue Nov 12, 2024 · 3 comments

@Drow999

Drow999 commented Nov 12, 2024

Thank you for the excellent work on this project! I am currently reproducing your work and have a question about an out-of-memory (OOM) error. I tried training this model on an RTX 3090. Could this be a memory leak? The 4090 and the 3090 should have the same amount of memory, shouldn't they?
image

@KyungdaePark

I have the same issue with an RTX 4090.

@juno181
Owner

juno181 commented Nov 14, 2024

As I understand it, this issue arises because image loading on the CPU outpaces processing on the GPU, so loaded images accumulate in memory while they wait to be consumed. The data loading also relies on joblib, and unfortunately I have not found a complete fix within joblib. Reducing the number of CPU workers may help mitigate the problem. I recommend lowering the default value of the n_job argument in the following two lines.

def getTrainCameras(self, scale=1.0, shuffle=True, return_as="generator", n_job=4, return_path=False, get_img=True, job_batch_size=2):

def getTestCameras(self, scale=1.0, shuffle=True, return_as="generator", n_job=4, return_path=False, get_img=True, job_batch_size=2):
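
For readers who want to see the backpressure idea behind "fewer workers" in isolation, here is a minimal sketch using only the standard library. It is not the repository's loader and does not use joblib; `bounded_prefetch`, `load_fn` and `max_pending` are hypothetical names chosen for illustration. A background thread loads items into a bounded queue, so a fast loader blocks once the queue is full instead of piling results up in memory.

```python
import queue
import threading


def bounded_prefetch(load_fn, items, max_pending=2):
    """Yield load_fn(item) for each item, loading in a background thread.

    The queue holds at most `max_pending` finished results; the producer
    may hold one more while blocked, so memory use stays bounded.
    """
    q = queue.Queue(maxsize=max_pending)

    def producer():
        try:
            for item in items:
                # put() blocks while the queue is full (backpressure).
                q.put((True, load_fn(item)))
            q.put((False, None))  # normal end of stream
        except Exception as exc:
            q.put((False, exc))  # hand the error to the consumer

    threading.Thread(target=producer, daemon=True).start()

    while True:
        ok, value = q.get()
        if ok:
            yield value
        elif value is None:
            return
        else:
            raise value
```

In a training loop this would wrap the image reader, for example `for img in bounded_prefetch(read_image, paths): ...`, where `read_image` and `paths` are placeholders. If the consumer stops early, the daemon thread stays blocked until the process exits, which is acceptable for a sketch but worth handling in real code.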

@Cranjis-McB

Cranjis-McB commented Nov 15, 2024

Okay, replacing getTrainCameras() and getTestCameras() with the following worked for me.

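    # Needs at module level: `import random`, `from itertools import compress`,
    # and `from PIL import Image, ImageFile`. PILtoTorch is assumed to be the
    # repo's own helper.
    def getTrainCameras2(self, scale=1.0, shuffle=True, return_as="generator", return_path=False, get_img=True):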
        if self.lazy_loader:
            t_cams = list(compress(self.train_cameras[scale], self.samplelist))
            t_imgs = [(i.image_path, i.resolution, i.im_scale) for i in t_cams]

            if shuffle:
                temp = list(zip(t_cams, t_imgs))
                random.shuffle(temp)
                res1, res2 = zip(*temp)
                t_cams, t_imgs = list(res1), list(res2)

            if return_path:
                return t_cams, t_imgs

            def im_reader(path, resolution, im_scale):
                ImageFile.LOAD_TRUNCATED_IMAGES = True
                return (PILtoTorch(Image.open(path), resolution)[:3, ...] / im_scale).clamp(0, 1)

            if get_img:
                if return_as == "list":
                    imgs = [im_reader(path, resolution, im_scale) for path, resolution, im_scale in t_imgs]
                    return t_cams, imgs
                else:  # Assume "generator"
                    def img_generator():
                        for path, resolution, im_scale in t_imgs:
                            yield im_reader(path, resolution, im_scale)

                    return t_cams, img_generator()
            else:
                return t_cams, None

        else:
            t_cams = list(compress(self.train_cameras[scale], self.samplelist))
            if return_path:
                t_imgs = [(i.image_path, i.resolution) for i in t_cams]
            else:
                t_imgs = [i.image for i in t_cams]

            if shuffle:
                temp = list(zip(t_cams, t_imgs))
                random.shuffle(temp)
                res1, res2 = zip(*temp)
                t_cams, t_imgs = list(res1), list(res2)

            if return_path:
                return t_cams, t_imgs

            if return_as == "list":
                return t_cams, t_imgs
            else:
                def img_iterator():
                    for img in t_imgs:
                        yield img

                return t_cams, img_iterator()

    def getTestCameras2(self, scale=1.0, shuffle=True, return_as="generator", return_path=False, get_img=True):
        if self.lazy_loader:
            t_cams = list(compress(self.test_cameras[scale], self.test_samplelist))
            t_imgs = [(i.image_path, i.resolution, i.im_scale) for i in t_cams]

            if shuffle:
                temp = list(zip(t_cams, t_imgs))
                random.shuffle(temp)
                res1, res2 = zip(*temp)
                t_cams, t_imgs = list(res1), list(res2)

            if return_path:
                return t_cams, t_imgs

            # Define image reader function
            def im_reader(path, resolution, im_scale):
                ImageFile.LOAD_TRUNCATED_IMAGES = True
                return (PILtoTorch(Image.open(path), resolution)[:3, ...] / im_scale).clamp(0, 1)

            if get_img:
                if return_as == "list":
                    imgs = [im_reader(path, resolution, im_scale) for path, resolution, im_scale in t_imgs]
                    return t_cams, imgs
                else:  # Assume "generator"
                    def img_generator():
                        for path, resolution, im_scale in t_imgs:
                            yield im_reader(path, resolution, im_scale)

                    return t_cams, img_generator()
            else:
                return t_cams, None
        else:
            t_cams = list(compress(self.test_cameras[scale], self.test_samplelist))
            if return_path:
                t_imgs = [(i.image_path, i.resolution) for i in t_cams]
            else:
                t_imgs = [i.image for i in t_cams]

            if shuffle:
                temp = list(zip(t_cams, t_imgs))
                random.shuffle(temp)
                res1, res2 = zip(*temp)
                t_cams, t_imgs = list(res1), list(res2)

            if return_path:
                return t_cams, t_imgs

            if return_as == "list":
                return t_cams, t_imgs
            else:  # Assume "generator"
                def img_iterator():
                    for img in t_imgs:
                        yield img

                return t_cams, img_iterator()
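
The reason a plain generator helps is that each image is read only when the training loop asks for it, so nothing is queued ahead of the GPU. The following self-contained sketch illustrates that behaviour with a stand-in loader; `lazy_images` and `fake_load` are hypothetical names, and the strings stand in for real image tensors.

```python
def lazy_images(paths, load):
    """Yield one loaded image at a time; nothing is read until requested."""
    for path in paths:
        yield load(path)


load_log = []  # records which paths have actually been read


def fake_load(path):
    # Stand-in for the real image reader; returns a placeholder string.
    load_log.append(path)
    return f"tensor:{path}"


cams = ["cam_a", "cam_b", "cam_c"]
paths = ["a.png", "b.png", "c.png"]
imgs = lazy_images(paths, fake_load)

# Creating the generator reads nothing yet.
loads_before = len(load_log)

# Pair cameras with images the way a training loop would.
pairs = zip(cams, imgs)
first_cam, first_img = next(pairs)
loads_after_first = len(load_log)
```

With the replacement methods above, the equivalent loop would presumably look like `cams, imgs = scene.getTrainCameras2()` followed by `for cam, img in zip(cams, imgs): ...`, where `scene` is a placeholder for the object that owns these methods. The trade-off is that loading is no longer parallel, so the GPU may wait on disk reads.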
