fabric/Alma.md at v1.4.81

mirror of https://github.com/danielmiessler/fabric synced 2024-11-08 07:11:06 +00:00

Daniel Miessler 21186097e4 Updated the Alma.md file.

2024-10-21 17:05:19 +02:00

Document Purpose

This document captures the SPQA policy and state for Alma Security, a security startup out of Redwood City, CA.

This is part of the SPQA context that will be used to answer questions and create artifacts for the company, e.g., company strategy, security strategy, quarterly security reports (QSRs), project plans, recommendations on which projects to undertake, which investments to take and avoid, and other such decisions.

A major aspect of the SPQA system is the definition of the company's mission, goals, KPIs, and challenges. These shape everything within the company and thus should be used to shape the recommendations made when asked.

In addition to the clearly stated goals and other defining characteristics listed above, there will also be a streaming list of updates coming into this system using the Activity document.

Those will be changes, updates, or modifications to the direction of the company. For example, if Goal number 4 is to build a new datacenter in Boise, Idaho, but we see an update in the Activity section that says we've lost the ability to build in Boise, we should consider goal #4 out of the picture for prioritization and other decision purposes. In other words, the streaming activity log into this document should be considered updates to the core content.
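The override rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the SPQA system itself: the `Goal` and `Update` types, the `cancels` field, and the `active_goals` function are hypothetical names, and the data reuses the hypothetical Boise scenario from the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Goal:
    goal_id: str  # e.g. "G4"
    text: str


@dataclass(frozen=True)
class Update:
    note: str                      # free-text entry from the Activity document
    cancels: Optional[str] = None  # hypothetical field: ID of a goal this update invalidates


def active_goals(goals, updates):
    """Return the goals still in play after applying Activity updates."""
    cancelled = {u.cancels for u in updates if u.cancels is not None}
    return [g for g in goals if g.goal_id not in cancelled]


# Hypothetical data mirroring the Boise example in the text.
goals = [
    Goal("G3", "Example goal that is unaffected"),
    Goal("G4", "Build a new datacenter in Boise, Idaho"),
]
updates = [Update("We've lost the ability to build in Boise", cancels="G4")]

remaining = active_goals(goals, updates)
print([g.goal_id for g in remaining])
```

Here the Boise update removes G4 from consideration, so only G3 remains available for prioritization.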

Company History

Alma Security was started by Chris Meyers, who was previously at Sigma Systems as CTO and HPE as a senior security engineer.

He started the company because, "I saw a gap in the authentication market, where companies were only looking at one or two aspects of one's identity to do authentication. We're looking at the whole picture and turning that into a continuous authentication story."

Company Mission

The mission of Alma Security is to ensure businesses can continuously authenticate their users using their whole selves.

Company Goals (G1 means goal 1, G2 is goal 2, etc. Treat each item (goal/kpi/etc) as half as important as the one before it.)

NOTE: Some goals are things like project rollouts which serve the higher goals. In that case they shouldn't always be considered so much lower priority because one is serving the other.

Company Goals

G1: Achieve 20% market share by January 2025
G2: Hit 10000 active customers by January 2025
G3: Hit a customer trust score of 90+% by January 2025
G4: Get churn below 5% by August 2024
G5: Launch in Europe by August 2024
G6: Launch in India by November 2024
G7: Launch Mood-monitor integration by February 2024
G8: Launch partnership with Apple Passkeys by June 2024
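One way to read the "half as important as the one before it" rule is as a geometric weighting, where the first item has weight 1 and each later item has half the weight of its predecessor. The sketch below assumes that reading. The function name `priority_weights` is hypothetical, and the code does not model the exception in the NOTE above for goals that serve higher goals.

```python
def priority_weights(ids):
    """Give the first item weight 1.0 and each later item half the previous weight."""
    return {item_id: 0.5 ** rank for rank, item_id in enumerate(ids)}


goal_ids = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
weights = priority_weights(goal_ids)

for goal_id, weight in weights.items():
    print(f"{goal_id}: {weight}")
```

Under this reading G2 carries a weight of 0.5 and G8 a weight of 0.0078125 relative to G1, which shows how quickly the lower-numbered goals dominate.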

Company KPIs

K1: Current marketshare percentage
K2: Number of active customers
K3: Current churn percentage
K4: Launched_in_Europe (yes/no)
K5: Launched_in_India (yes/no)

Security Team Mission

SM1: Protect Alma Security's customers and intellectual property from security and privacy incidents.

Security Team Goals

SG1: Secure all customer data -- especially biometric -- from security and privacy incidents.
SG2: Protect Alma Security's intellectual property from being captured by unauthorized parties.
SG3: Reach a time to detect malicious behavior of less than 4 minutes by January 2025
SG4: Ensure the public trusts our product. Because it's an authentication product, we can't survive if people don't trust us.
SG5: Reach a time to remediate critical vulnerabilities on crown jewel systems of less than 16 hours by August 2025
SG6: Reach a time to remediate critical vulnerabilities on all systems of less than 3 days by August 2025
SG7: Complete audit of Apple Passkey integration by February 2025
SG8: Complete remediation of Apple Passkey vulns by February 2025

Security Team KPIs (How we measure the team)

SK1: TTD: Time to detect malicious behavior (Minutes)
SK2: TTI: Time to begin investigation of malicious behavior (Minutes)
SK3: TTR-CJC: Time to remediate critical vulnerabilities on crown jewel systems (Hours)
SK4: TTR-C: Time to remediate critical vulnerabilities on all systems (Hours)
SK5: PT: Public trust score (Complete, Significant, Moderate, Minimal, Distrust, N/A)
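The time-based KPIs above are reported in different units than some of the goals they track (SG6 is stated in days, while TTR-C is measured in hours). The following sketch shows one way to keep the comparison consistent. The names `TARGETS`, `meets_target`, and `gap` are hypothetical; only the target values come from SG3, SG5, and SG6.

```python
# Targets from SG3, SG5, and SG6, expressed in each KPI's own unit.
TARGETS = {
    "TTD": 4,         # minutes (SG3: less than 4 minutes)
    "TTR-CJC": 16,    # hours (SG5: less than 16 hours)
    "TTR-C": 3 * 24,  # hours (SG6: less than 3 days)
}


def meets_target(kpi, value):
    """True when the measured value is strictly below the target for that KPI."""
    return value < TARGETS[kpi]


def gap(kpi, value):
    """Amount by which the measured value exceeds the target (0 if it does not)."""
    return max(0, value - TARGETS[kpi])


# Example: a remediation time of 7 days on all systems, converted to hours.
measured_hours = 7 * 24
print(meets_target("TTR-C", measured_hours))
print(gap("TTR-C", measured_hours))
```

A measurement of 7 days is 168 hours, which is 96 hours above the 72-hour threshold implied by SG6.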

Risk Register (The things we're most worried about)

R1: Our infrastructure security team is understaffed by 50% after 5 key people left
R2: We are not currently monitoring our external perimeter for attack surface related vulnerabilities like open ports, listening applications, unknown hosts, unknown subdomains pointing to these things, etc. We only do scans once every couple of months and we don't really have anyone to look at the results
R3: It takes us multiple days to investigate potential malicious behavior on our systems.
R4: We lack a full list of our assets, including externally facing hosts, S3 buckets, etc., which make up our attack surface
R5: We have a low public trust score due to the events of 2022.

Security Team Narrative

Background

Alma hired a new security team starting in January of 2023 and we have been building out the program since then. The philosophy and approach for the security team is to explicitly articulate what we believe the highest risks are to Alma, to deploy targeted strategies to address those risks, and to use clear, transparent KPIs to show progress towards our goals over time.

Current Risks

So our risk register looks like this:

We are understaffed by 50% after 5 key people left in 2022
Our perimeter is not being monitored for attack surface related vulnerabilities
It takes us too long to detect and start investigating malicious behavior on our systems
We do not have a full list of our assets, which makes it difficult to know what we need to protect
We have a low public trust score due to the events of 2022

Strategies

As such, our strategies are as follows:

Hire 5 more A-tier security professionals
Purchase and implement an attack surface management solution
Invest in our detection and response capabilities
Purchase an asset inventory system that integrates with our attack surface management tool
Leverage PR to share as much of our progress as possible with the public to rebuild trust

How We're Doing

We believe being transparent about our progress is key to everything, and for that reason we maintain a limited number of KPIs that we update every quarter. These metrics will not change often. They will remain consistent so that it's easy to track how we're spending our resources and the progress we're making.

Those KPIs are:

Time to detect malicious behavior
Time to start investigating malicious behavior
Time to remediate critical vulnerabilities on crown jewel systems
Time to remediate critical vulnerabilities on all systems
Our public trust score

As of DATE, our KPIs for these are currently:

GIVE CURRENT KPIs from the Activity section below

INSERT GRAPHS OF KPI PROGRESS OVER TIME HERE

Security Team Strategies

STS1: Hire 5 more A-tier security professionals
STS2: Purchase an attack surface management solution
STS3: Invest in our detection and response capabilities
STS4: Purchase an asset inventory system that integrates with our attack surface management tool

Infrastructure Notes (a basic description of our tech stack and various context around it)

We currently have no WAF protecting our main web app but we're considering adding one
We have had some issues with S3 buckets becoming public, or being set up as public, which has led to some close calls with customer data almost being exposed.
alma.amazon-domain.com is our primary S3 bucket that contains everything, but it's not publicly readable or listable
We have a root account for our AWS account that doesn't yet have 2FA on it, but we're working on fixing that within a few weeks (but it's been open for a few months)
We also use Postgres for all our databases.
Developers have root access to all Kubernetes nodes via SSH on port 45001 using a shared developer key issued during laptop provisioning.
We're a Kubernetes shop and do everything through AWS
We're logging most stuff to CloudTrail and we kind of use GuardDuty, but we don't have a 24/7 team to monitor alerts and logs. We should add that to our list of challenges next time we update our overall policy
We also have a Windows infrastructure because some key personnel came from Microsoft. The DC is hosted in our head office which is in Redwood City, and anyone who works in that office (most of the 300 employees) uses that to log in when they start work. The domain is ALMA.
There's a domain-joined fileserver running Windows 2012 that most people use to upload new ideas and plans for new products. It uses Windows authentication from the domain.
We use a Palo Alto firewall with 2FA using Windows authenticator tied to SSO.
The name of the AI system doing all this context creation using SPQA is Alma, which is also the name of the company.
We use Workday for HR stuff, Slack for realtime communications, Outlook 365 as a service, and SentinelOne on the workstations and laptops. Servers in AWS are mostly Amazon Linux 2 with a few Ubuntu boxes that are a few years old.
We also primarily use Postgres for all of our systems.

Team

Projects

SECURITY POSTURE (To be referenced for compliance questions and security questionnaires)

July 2019
Admin accounts still not required to use 2FA.
Company laptops distributed to employees, no MDM yet for device management.
AWS IAM roles created for engineers, but root access still frequently used.
Started basic vulnerability scanning using open-source tools.

December 2019
MFA enforced for all Google Workspace accounts after a phishing attempt.
Introduced ClamAV for basic endpoint protection on corporate laptops.
AWS GuardDuty enabled for threat detection, but no formal incident response team.
First incident response plan table-top exercise conducted, but findings not fully documented.

April 2020
Migrated from Google Workspace to Office 365, with MFA enabled for all users.
Rolled out SentinelOne for endpoint protection on 50% of company laptops.
Implemented least-privilege access control for AWS IAM roles.
First formal vendor risk management review completed for major SaaS providers.

August 2020
Completed full deployment of SentinelOne across all endpoints.
Implemented AWS CloudWatch for real-time alerts; however, logs still not monitored 24/7.
Began encrypting all AWS S3 buckets at rest using server-side encryption.
First internal review of data retention policies, started drafting data disposal policy.

January 2021
Rolled out Jamf MDM for centralized management of macOS devices, enforcing encryption (FileVault) on all laptops.
Strengthened Office 365 security by implementing phishing-resistant MFA using authenticator apps.
AWS KMS introduced for managing encryption keys; manual key rotation policy documented.
Introduced formal onboarding and offboarding processes for employee account management.

July 2021
Conditional access policies introduced for Office 365, restricting access based on geography (US-only).
Conducted company-wide security awareness training for the first time, focusing on phishing threats.
Completed first backup and disaster recovery (DR) drill with AWS, documenting recovery times.
AWS Config deployed to monitor and enforce encryption and access control policies across accounts.

December 2021
Full migration to AWS for all production systems completed.
Incident response playbook finalized and shared with the security team; still no 24/7 monitoring.
Documented data classification policies for handling sensitive customer data in preparation for SOC 2 audit.
First third-party penetration test conducted, critical vulnerabilities identified and remediated within 30 days.

March 2022
Rolled out company-wide 2FA for all critical systems, including Office 365, AWS, GitHub, and Slack.
Introduced AWS Secrets Manager for managing sensitive credentials, eliminating hardcoded API keys.
Updated all documentation for identity and access management in preparation for SOC 2 Type 1 audit.
First external vulnerability scan completed using Qualys, with remediation SLAs established.

April 2022
Updated and consolidated all security policies (incident response, access control, data retention) in preparation for SOC 2 audit.
Conducted tabletop exercise for ransomware response, documenting gaps in the incident response process.
Implemented Just-In-Time (JIT) access for administrative privileges in AWS, reducing unnecessary persistent access.

October 2022
Passed SOC 2 Type 1 audit, with recommendations to improve monitoring and asset management.
Launched quarterly phishing simulations to raise employee awareness and track training effectiveness.
Fully enforced encryption for all customer data in transit and at rest using AWS KMS.
Extended GuardDuty to cover all AWS regions; started monitoring alerts daily.

January 2023
Hired a dedicated CISO and expanded security team by 30%.
Integrated continuous vulnerability scanning across all externally facing assets using Qualys.
Conducted first third-party vendor risk assessment to ensure alignment with SOC 2 and internal security standards.
Implemented automated patch management for all AWS EC2 instances, reducing time to deploy critical patches.

July 2023
Rolled out continuous attack surface monitoring (ASM) to identify and remediate external vulnerabilities.
Performed annual data retention review, ensuring compliance with SOC 2 and GDPR requirements.
Conducted a disaster recovery drill for AWS workloads, achieving a recovery time objective (RTO) of under 4 hours.
Completed SOC 2 Type 2 readiness assessment, with focus on improving incident response times.

November 2023
Updated incident response documentation and assigned 24/7 monitoring to a third-party SOC provider.
Rolled out zero-trust network architecture across the organization, removing reliance on VPN for remote access.
Passed SOC 2 Type 2 audit with no major findings; recommendations included improved asset inventory tracking.
Conducted full audit of access control policies and JIT access implementation in preparation for ISO 27001 certification.

April 2024
Implemented AI-driven threat detection to reduce time to detect security incidents from 10 hours to under 2 hours.
Completed full encryption audit across all databases, ensuring compliance with GDPR, HIPAA, and other privacy regulations.
Updated employee training programs to include privacy regulations (GDPR, CCPA) and data handling best practices.
Completed internal review and audit of vendor access to critical systems as part of SOC 2 compliance effort.
Completed move of all AWS services to us-west-2 and us-east-1 regions for 100% us-based cloud services.

October 2024
Conducted organization-wide review of data retention and disposal policies, implementing automated data deletion for expired data.
Implemented continuous compliance monitoring for SOC 2, with automated alerts for deviations in access controls and encryption settings.
Finalized implementation of AI-based monitoring and response systems, significantly reducing time to remediate critical vulnerabilities.
Passed SOC 2 Type 2 and ISO 27001 audits with zero non-conformities, achieving full compliance across all control areas.

March 2018
Personal Gmail accounts used for internal and external communication.
No 2FA enabled on any accounts.
AWS accounts shared with engineers, no IAM roles or formal access control policies.
No centralized endpoint protection; employees use personal laptops with no security controls.
No documented security policies or incident response plan.

September 2018
Initiated migration from personal Gmail to Google Workspace (G Suite) for business email.
Password complexity requirements introduced (minimum 8 characters).
AWS root credentials still shared among team members, no MFA enabled.
No formal logging or monitoring in place for AWS activity.

February 2019
Completed migration to Google Workspace; no email encryption yet.
Introduced a basic password manager (LastPass) but no enforcement policy.
AWS CloudTrail enabled for logging, but no one is reviewing logs.
First draft of the incident response plan created, but not tested.

June 2019
Enforced MFA for Google Workspace admin accounts; standard user

CURRENT STATE (KPIs, Metrics, Project Activity Updates, etc.)

October 2022: Current time to detect malicious behavior is 81 hours
October 2022: Current time to start investigating malicious behavior is 82 hours
October 2022: Current time to remediate critical vulnerabilities on crown jewel systems is 21 days
October 2022: Current time to remediate critical vulnerabilities on all systems is 51 days
January 2023: Current time to detect malicious behavior is 62 hours
January 2023: Current time to start investigating malicious behavior is 72 hours
January 2023: Current time to remediate critical vulnerabilities on crown jewel systems is 17 days
January 2023: Current time to remediate critical vulnerabilities on all systems is 43 days
July 2023: Current time to detect malicious behavior is 29 hours
July 2023: Current time to start investigating malicious behavior is 41 hours
July 2023: Current time to remediate critical vulnerabilities on crown jewel systems is 12 days
July 2023: Current time to remediate critical vulnerabilities on all systems is 29 days
November 2023: Current time to detect malicious behavior is 12 hours
November 2023: Current time to start investigating malicious behavior is 16 hours
November 2023: Current time to remediate critical vulnerabilities on crown jewel systems is 9 days
November 2023: Current time to remediate critical vulnerabilities on all systems is 17 days
February 2024: Started attack surface management vendor selection process
January 2024: Current time to detect malicious behavior is 9 hours
January 2024: Current time to start investigating malicious behavior is 14 hours
January 2024: Current time to remediate critical vulnerabilities on crown jewel systems is 8 days
January 2024: Current time to remediate critical vulnerabilities on all systems is 12 days
March 2024: We're now remediating crits on crown jewels in less than 6 days
April 2024: We're now remediating all criticals within 11 days
July 2024: Criticals are now being fixed in 9 days
On August 5 we got remediation of critical vulnerabilities down to 7 days
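Because the entries above follow a fairly regular "Month Year: Current time to ... is N units" shape, the KPI trend can be pulled out mechanically. The sketch below is a minimal example for time to detect only. The regular expression, the `ttd_series` function, and the `LOG` constant are assumptions made for illustration (the pattern tolerates minor wording variations), and the sample entries are the time-to-detect values from the log above.

```python
import re

# Time-to-detect entries taken from the activity log above.
LOG = """\
October 2022: Current time to detect malicious behavior is 81 hours
January 2023: Current time to detect malicious behavior is 62 hours
July 2023: Current time to detect malicious behavior is 29 hours
November 2023: Current time to detect malicious behavior is 12 hours
January 2024: Current time to detect malicious behavior is 9 hours
"""

# Hypothetical pattern; "(?:start )?" tolerates small wording differences.
PATTERN = re.compile(
    r"^(?P<when>\w+ \d{4}): Current time to (?:start )?detect "
    r"malicious behavior is (?P<hours>\d+) hours$"
)


def ttd_series(log_text):
    """Extract (month, hours) pairs for time to detect, in log order."""
    series = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            series.append((match["when"], int(match["hours"])))
    return series


series = ttd_series(LOG)
first, latest = series[0][1], series[-1][1]
reduction = (first - latest) / first * 100
print(f"TTD went from {first}h to {latest}h ({reduction:.1f}% reduction)")
print(f"Latest TTD in minutes: {latest * 60} (SG3 target: under 4)")
```

On these entries, time to detect falls from 81 hours to 9 hours, a reduction of roughly 88.9%. The latest value is still 540 minutes, far above the 4-minute target in SG3.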
