---
jupytext:
  cell_metadata_filter: all
  formats: md:myst
  main_language: python
  notebook_metadata_filter: all
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.16.4
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---

+++ {"lines_to_next_cell": 0}

(memray_example)=

# Memray Profiling Example

Memray tracks and reports memory allocations, both in Python code and in compiled extension modules.
This Memray Profiling plugin enables memory tracking at the Flyte task level and renders the Memray profiling report on Flyte Deck.

```{code-cell}
import time

from flytekit import ImageSpec, task, workflow
from flytekitplugins.memray import memray_profiling
```

+++ {"lines_to_next_cell": 0}

First, we use `ImageSpec` to construct a container image that contains the dependencies for the tasks we want to profile:

```{code-cell}
image = ImageSpec(
    name="memray_demo",
    packages=["flytekitplugins_memray"],
    registry="ghcr.io/flyteorg",  # Use your image registry
)
```

+++ {"lines_to_next_cell": 0}

Next, we define a dummy function that generates data in memory without releasing it:

```{code-cell}
def generate_data(n: int):
    leak_list = []
    for _ in range(n):  # Arbitrary large number for demonstration
        large_data = " " * 10**6  # 1 MB string
        leak_list.append(large_data)  # Keeps appending without releasing
        time.sleep(0.1)  # Slow down the loop to observe memory changes
```

+++ {"lines_to_next_cell": 0}

The first task profiles the memory usage of `generate_data()` with Memray's `table` HTML reporter:

```{code-cell}
:lines_to_next_cell: 2

@task(container_image=image, enable_deck=True)
@memray_profiling(memray_html_reporter="table")
def memory_usage(n: int) -> str:
    generate_data(n=n)

    return "Well"
```

+++ {"lines_to_next_cell": 0}

The second task profiles the memory leakage of `generate_data()` with Memray's `flamegraph` HTML reporter, passing `--leaks` to the reporter:

```{code-cell}
:lines_to_next_cell: 2

@task(container_image=image, enable_deck=True)
@memray_profiling(trace_python_allocators=True, memray_reporter_args=["--leaks"])
def memory_leakage(n: int) -> str:
    generate_data(n=n)

    return "Well"
```

+++ {"lines_to_next_cell": 0}

Finally, we put everything together in a workflow:

```{code-cell}
@workflow
def wf(n: int = 500):
    memory_usage(n=n)
    memory_leakage(n=n)
```
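
As a rough local illustration of the retention pattern that `generate_data()` is meant to exhibit, the sketch below measures how much memory stays allocated while the appended strings are still referenced. It is independent of Flyte and Memray: it swaps in the standard library's `tracemalloc` as a stand-in, so it does not show what the plugin does internally. The names `retain_chunks` and `measure_retained_bytes` and the `chunk_size` parameter are hypothetical helpers introduced only for this example, and the sleep is dropped so the check runs quickly.

```python
import tracemalloc


def retain_chunks(n: int, chunk_size: int = 10**5) -> list:
    """Hypothetical variant of generate_data(): keeps every chunk referenced."""
    retained = []
    for _ in range(n):
        retained.append(" " * chunk_size)  # Each new string stays in the list
    return retained


def measure_retained_bytes(n: int, chunk_size: int = 10**5) -> int:
    """Return the bytes still allocated while the retained data is alive."""
    tracemalloc.start()
    try:
        data = retain_chunks(n, chunk_size)
        # get_traced_memory() returns (current, peak) in bytes
        current, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del data  # Drop the reference only after the measurement
    return current


if __name__ == "__main__":
    for n in (10, 20, 40):
        megabytes = measure_retained_bytes(n) / 1e6
        print(f"n={n}: about {megabytes:.1f} MB still allocated")
```

The retained size should grow roughly linearly with `n`, which is the same behavior the Memray reports in Flyte Deck are expected to surface for the tasks above.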