Masking
Masking lets you automatically redact or transform sensitive data in your traces before they are sent to the observatory. It matters for two main reasons:
- Security: Prevent exposure of credentials or sensitive business data
- Regulatory Compliance: Meet requirements like GDPR, HIPAA, or CCPA
Configure Masking
To implement masking, you need to define a masking function and configure the trace manager to use it.
Python
import re
from deepeval.tracing import observe, trace_manager

def masking_function(data):
    # Only strings are masked here; other types pass through unchanged
    if isinstance(data, str):
        data = re.sub(r'\b(?:\d{4}[- ]?){3}\d{4}\b', '[REDACTED CARD]', data)
        return data
    return data

# Register the masking function with the trace manager
trace_manager.configure(mask=masking_function)

@observe(type="agent")
def agent(query: str):
    return "4242-4242-4242-4242"

agent("Test Masking")
The masking function is automatically applied to:
- Span attributes: Any attributes passed to span decorators or wrappers
- Observed function I/O: All input parameters and return values of functions with the @observe decorator
⚠️
Since the masking function is applied to an observed function’s inputs, outputs, and span attributes, it must handle every data type those fields can contain, not only strings.
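One way to meet this requirement is a small helper that walks nested containers and applies a string-level masker to every string it finds. The sketch below is illustrative only: `mask_nested` and `redact_card` are hypothetical names, not part of deepeval, and it assumes the traced values are built from strings, lists, tuples, and dicts.

```python
import re
from typing import Any, Callable

def mask_nested(data: Any, mask_str: Callable[[str], str]) -> Any:
    """Apply mask_str to every string inside nested lists, tuples, and dicts."""
    if isinstance(data, str):
        return mask_str(data)
    if isinstance(data, list):
        return [mask_nested(item, mask_str) for item in data]
    if isinstance(data, tuple):
        return tuple(mask_nested(item, mask_str) for item in data)
    if isinstance(data, dict):
        # Only values are masked; keys are left as-is (an assumption of this sketch)
        return {k: mask_nested(v, mask_str) for k, v in data.items()}
    # Numbers, None, and other types pass through unchanged
    return data

def redact_card(text: str) -> str:
    # Hypothetical string-level masker, using the card pattern from this page
    return re.sub(r'\b(?:\d{4}[- ]?){3}\d{4}\b', '[REDACTED CARD]', text)

payload = {
    "query": "Card 4242-4242-4242-4242",
    "history": ["hi", ("4111 1111 1111 1234", 7)],
    "retries": 3,
}
masked = mask_nested(payload, redact_card)
print(masked)
```

Keeping the traversal separate from the pattern means a new pattern only needs a string-to-string function.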
Example Masking functions
Credit card number
Python
import re
from typing import Any

def mask_credit_card(data: Any) -> Any:
    # Mask 16-digit card numbers, recursing into lists and dicts
    if isinstance(data, str):
        return re.sub(r'\b(?:\d{4}[- ]?){3}\d{4}\b', '[REDACTED CARD]', data)
    elif isinstance(data, list):
        return [mask_credit_card(item) for item in data]
    elif isinstance(data, dict):
        return {k: mask_credit_card(v) for k, v in data.items()}
    else:
        return data

print(mask_credit_card("My card number is 4111 1111 1111 1234."))
# Output: My card number is [REDACTED CARD].
Email address
Python
import re
from typing import Any

def mask_email(data: Any) -> Any:
    # Keep the first character and the domain; mask the rest of the local part
    if isinstance(data, str):
        return re.sub(r'\b([\w.%+-])[^\s@]*(@[\w.-]+\.\w+)', r'\1*****\2', data)
    elif isinstance(data, list):
        return [mask_email(item) for item in data]
    elif isinstance(data, dict):
        return {k: mask_email(v) for k, v in data.items()}
    else:
        return data

print(mask_email("Contact me at johndoe@example.com."))
# Output: Contact me at j*****@example.com.
Bank account number
Python
import re
from typing import Any

def mask_bank_account(data: Any) -> Any:
    # Mask runs of six or more digits, keeping only the last four visible
    if isinstance(data, str):
        return re.sub(
            r'\b\d{6,}(?!\d)',
            lambda m: '*' * (len(m.group()) - 4) + m.group()[-4:],
            data
        )
    elif isinstance(data, list):
        return [mask_bank_account(item) for item in data]
    elif isinstance(data, dict):
        return {k: mask_bank_account(v) for k, v in data.items()}
    else:
        return data

print(mask_bank_account("My account is 9876543210."))
# Output: My account is ******3210.
Passport number
Python
import re
from typing import Any

def mask_passport_number(data: Any) -> Any:
    # Keep the first and last characters; replace the middle with a fixed mask
    if isinstance(data, str):
        return re.sub(r'\b([A-Z])([A-Z0-9]{6,8})([A-Z0-9])\b', r'\1*******\3', data)
    elif isinstance(data, list):
        return [mask_passport_number(item) for item in data]
    elif isinstance(data, dict):
        return {k: mask_passport_number(v) for k, v in data.items()}
    return data

print(mask_passport_number("Passport: A12345678."))
# Output: Passport: A*******8.
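Because the configuration example passes a single masking function, several of these patterns may need to be chained into one. The following is a minimal sketch, assuming each masker accepts and returns the same kind of data. `compose_maskers` is a hypothetical helper, not a deepeval API, and the two maskers are string-only versions redefined here to keep the block self-contained.

```python
import re
from typing import Any, Callable

Masker = Callable[[Any], Any]

def compose_maskers(*maskers: Masker) -> Masker:
    """Return one function that applies each masker in the order given."""
    def combined(data: Any) -> Any:
        for masker in maskers:
            data = masker(data)
        return data
    return combined

def mask_card(data: Any) -> Any:
    if isinstance(data, str):
        return re.sub(r'\b(?:\d{4}[- ]?){3}\d{4}\b', '[REDACTED CARD]', data)
    return data

def mask_email_address(data: Any) -> Any:
    if isinstance(data, str):
        return re.sub(r'\b([\w.%+-])[^\s@]*(@[\w.-]+\.\w+)', r'\1*****\2', data)
    return data

mask_all = compose_maskers(mask_card, mask_email_address)
result = mask_all("Pay with 4242 4242 4242 4242, receipt to johndoe@example.com.")
print(result)

# The combined function could then be registered as in the configuration example:
# trace_manager.configure(mask=mask_all)
```

Order can matter when patterns overlap. For instance, the bank account pattern above also matches an unseparated 16-digit card number, so the card masker would need to run first for such input to be labeled as a card.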