Quick Start
How to get started using Phirestream.
The safest way to manage sensitive information in your systems is to apply safeguards before the sensitive information can enter your systems. Phirestream works in front of Apache Kafka to redact sensitive information such as Protected Health Information (PHI) and Personally Identifiable Information (PII) from your streams before the sensitive information is published to an Apache Kafka topic.

Launching Phirestream

Use one of the following links to launch Phirestream in your cloud.
Follow your cloud provider's steps for launching Phirestream. Once Phirestream has been launched and its virtual machine is running, continue with this guide to configure Phirestream.
Phirestream can be used with self-managed Apache Kafka clusters and managed hosting services, such as Amazon MSK, Confluent Cloud, and Instaclustr.

Configuring Phirestream

With Phirestream now running we can configure it. Here we configure how Phirestream listens for incoming data for redaction and the details of the downstream Apache Kafka brokers.
Open the Phirestream configuration file at /opt/phirestream/config/application.properties. Set the value of the kafka.bootstrap.servers property to the location of your Apache Kafka broker(s). Then use the command below to restart Phirestream so that the change takes effect. (For a full list of the available Phirestream settings, see Settings.)
sudo systemctl restart phirestream
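The property edit described above can also be scripted. The following is a minimal sketch, not part of Phirestream itself: set_property is a hypothetical helper name, and it assumes the file uses plain key=value lines with no spaces around the equals sign.

```shell
# Hypothetical helper (not shipped with Phirestream): set KEY=VALUE in a
# properties file, replacing an existing line for KEY or appending one.
# Assumes plain "key=value" lines with no spaces around "=".
set_property() {
  sp_file=$1
  sp_key=$2
  sp_value=$3
  sp_tmp=$(mktemp) || return 1
  awk -v k="$sp_key" -v v="$sp_value" '
    BEGIN { FS = "=" }
    $1 == k { print k "=" v; found = 1; next }   # replace the existing entry
    { print }                                    # keep every other line
    END { if (!found) print k "=" v }            # append if KEY was absent
  ' "$sp_file" > "$sp_tmp" || { rm -f "$sp_tmp"; return 1; }
  # Write back through the original path so ownership and mode are kept.
  cat "$sp_tmp" > "$sp_file"
  sp_status=$?
  rm -f "$sp_tmp"
  return "$sp_status"
}
```

For example, as a user with write access to the file, run set_property /opt/phirestream/config/application.properties kafka.bootstrap.servers broker1:9092,broker2:9092 (the broker addresses are placeholders), and then restart Phirestream with the systemctl command above.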
Once Phirestream restarts, we are ready to publish and redact text. Phirestream's API endpoint is accessible at https://phirestream:8080/, where phirestream is the IP address or DNS name of the Phirestream virtual machine.

Using Phirestream to Redact Text

The following command publishes a single message to Phirestream. In this request, the text "George Washington was president." is published to the Apache Kafka topic named default.
Phirestream implements Apache Kafka's REST API interface. This means that Phirestream can be a drop-in solution for redacting text in your streaming data pipelines.
curl -k -X POST \
  https://localhost:8080/topics/default \
  -H 'Content-Type: application/vnd.kafka.json.v2+json' \
  -d '{
    "records": [
      {
        "key": "key-1",
        "value": "George Washington was president."
      }
    ]
  }'
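For scripted publishing, the request above can be wrapped in small shell functions. This is a sketch under assumptions: json_escape, build_payload, and publish are hypothetical names, PHIRESTREAM_URL is an assumed environment variable, and the escaping handles only backslashes and double quotes, so values containing newlines or other control characters need a real JSON encoder.

```shell
# Hypothetical helpers, not part of Phirestream.

# Escape backslashes and double quotes so the text can sit inside a JSON string.
# Control characters (newlines, tabs) are NOT handled by this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a single-record request body: $1 = record key, $2 = record value.
build_payload() {
  printf '{"records":[{"key":"%s","value":"%s"}]}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Publish one record: $1 = topic, $2 = key, $3 = value.
# PHIRESTREAM_URL is an assumed variable; it defaults to the URL used above.
# -k skips TLS certificate verification, as in the original command.
publish() {
  curl -k -sS -X POST \
    "${PHIRESTREAM_URL:-https://localhost:8080}/topics/$1" \
    -H 'Content-Type: application/vnd.kafka.json.v2+json' \
    -d "$(build_payload "$2" "$3")"
}
```

With these defined, publish default key-1 'George Washington was president.' sends the same request as the curl command above.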

Consuming the Redacted Text

Now, we will use Apache Kafka to consume from the default topic to get the redacted message:
kafka-console-consumer.sh \
  --topic default \
  --bootstrap-server localhost:9092 \
  --from-beginning
The output of the command is a single message with the following contents:
{{{REDACTED-entity}}} was president.
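A quick smoke test can confirm that a consumed message was redacted. The sketch below assumes redactions appear as {{{REDACTED-...}}} tokens, as in the output above (a different filter profile may produce a different format). contains_redaction is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the text contains a {{{REDACTED-...}}} token.
contains_redaction() {
  case $1 in
    *'{{{REDACTED-'*'}}}'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (commented out because it needs a running broker):
#   msg=$(kafka-console-consumer.sh --topic default \
#           --bootstrap-server localhost:9092 --from-beginning --max-messages 1)
#   contains_redaction "$msg" && echo "message was redacted"
```

The --max-messages 1 option makes the console consumer exit after reading one message, so the command can be used in a script.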
You are now ready to redact more streaming text with Phirestream!

Summary

In this example we can see that Phirestream received the request, redacted the person's name as sensitive information, and published the modified data to Apache Kafka.
The types of sensitive information that are identified by Phirestream are defined in files called filter profiles. A filter profile specifies the types of sensitive information and how to redact those types. Phirestream selects which filter profile to apply based on the name of the Apache Kafka topic. In the example above, the topic name was default so the filter profile named default was applied. Learn more about filter profiles.
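Because the profile is chosen by topic name, switching profiles from a client comes down to changing the topic segment of the request URL. A minimal sketch, in which topic_endpoint is a hypothetical helper and notes is a hypothetical topic with a filter profile of the same name:

```shell
# Hypothetical helper: build the publish URL for a topic.
# $1 = Phirestream base URL, $2 = topic name (which also selects the filter profile).
topic_endpoint() {
  printf '%s/topics/%s\n' "${1%/}" "$2"
}

topic_endpoint https://localhost:8080 default   # topic that uses the "default" profile
topic_endpoint https://localhost:8080/ notes    # would use a "notes" profile, if one exists
```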
You are now ready to begin using Phirestream to manage sensitive information in your streaming text!