[{"content":"Do you know what personal data your company is actually collecting? If you can\u0026rsquo;t answer that question confidently, you\u0026rsquo;re not alone. However, in Singapore, that uncertainty is expensive. SingHealth and its data intermediary IHiS paid a $1 million fine in 2019 for losing dispensed medication records of 159,000 patients. Six years later, Marina Bay Sands paid $315,000 for exposing information on 665,496 loyalty programme members.\nIn this article we discuss what it means to collect Personally Identifiable Information (PII) and Personal Data Protection (PDP) and how to understand and protect this data within an organization.\nThe Plan # Understand your customer Understand what information you collect from the customer Build a list First, understand who your customer is. If you\u0026rsquo;re a B2C, then your customers are individual users; if you\u0026rsquo;re a B2B then you deal with other businesses. Perhaps you\u0026rsquo;re a bit of both. Start here and figure out who pays you for a product or service. For example, let\u0026rsquo;s say you\u0026rsquo;re a retail business that sells Jewelry both online and in stores. Your customers are the people buying your jewelry.\nNext, figure out what information you collect from your customers, either when they buy from you or otherwise. This could be their name, phone number, email address or address. Perhaps your marketing team collects their date of birth to target birthday promotions. Look through your sales process, online or in-store, and also the marketing efforts. Do you have a loyalty programme, what kind of campaigns do you run.\nThen make a list of the data you collected so that we can move on to the next step. In the case of our example we could end up with something like this:\nData Field Collection Point Business Purpose Required? 
Full Name Checkout, Loyalty Signup Order fulfillment, personalization Yes Email Address Checkout, Newsletter Order confirmations, marketing campaigns Yes Phone Number Checkout Delivery updates Review Physical Address Checkout Delivery fulfillment Yes Date of Birth (Full) Loyalty Signup Birthday campaigns Review Gender Account Creation Product segmentation Review What Data Should I Collect? # Analyze whether the data you collect is in line with the requirements Identify \u0026ldquo;Bad Ideas\u0026rdquo; From the list of data you have gathered earlier, sit down and critically think through the reasons why your business collects this information. Here are some examples:\nDate Of Birth # Why do we collect our customer\u0026rsquo;s date of birth? Are we planning to run birthday campaigns? Do we need their whole date of birth, or just the Day/Month, or even just the Month? Are we selling a product that requires proof of age? Is there another way we can find this information out? Gender # Why collect customer\u0026rsquo;s gender? Do we sell products for Men and Women? Do we segment our marketing emails so that we target products accordingly? Through this process, ask questions to uncover whether the data being collected is a bad idea. If no one in the business can answer why specific data is being collected or they tell you \u0026ldquo;We collect it just in case.\u0026rdquo; then that\u0026rsquo;s a bad idea. For example, do you need the birth year or even the day? If you only run email campaigns, why collect their mobile phone number?\nAt this point, you will have a list of data that you need to collect as a business and a set of data that you probably shouldn\u0026rsquo;t collect. With the list of data that you don\u0026rsquo;t need, build out a separate plan to stop collecting and storing that data. 
Then move on.\nHere\u0026rsquo;s an example of a completed data collection audit for our jewelry store:\nHere is an Excel sheet that you can download and use for this exercise.\nWho Should I Speak To? # Speak to CPO, CMO, COO and similar roles. Make a list of the data they say that their teams collect. The three key people to speak with are your Product person, Marketing person and Operations person. These could be CPO, CMO, COO or Head of Product, Head of Marketing or Head of Operations depending on how your business hands out titles. Some non-tech native businesses may have a Digital transformation role. Have them take you through the flows for:\nA Sale Customer Service interaction Loyalty Signup Marketing Signup Marketing Campaign At each stage, note down the information that the customer has to part with (during a sale or self service) and what information is used (for marketing).\nHow Do I Trust This Information? # Verify the accuracy of the data collected by sampling Database servers Consider promoting shadow satellite systems to fully supported ones in the org. You are going to have to cross-reference what the stakeholder told you about the data with what is actually in the systems you run. This means looking where the data is stored. Typically this will be a database like MySQL, Postgres, or MongoDB. The DB will also usually be attached to a dashboard or a product of some sort (could be third party). Get someone on the engineering side to give you a list of all databases and tables in each database server.\nHere are some examples of how you can explore a database (note that these examples are MySQL specific). 
This first one shows you a list of all the tables in a specific database (in this case jewelry_store) and how many rows they have as well as when they were created.\nSELECT TABLE_NAME, TABLE_TYPE, ENGINE, TABLE_ROWS, CREATE_TIME FROM information_schema.TABLES WHERE TABLE_SCHEMA = \u0026#39;jewelry_store\u0026#39;; +------------+------------+--------+------------+---------------------+ | TABLE_NAME | TABLE_TYPE | ENGINE | TABLE_ROWS | CREATE_TIME | +------------+------------+--------+------------+---------------------+ | customers | BASE TABLE | InnoDB | 10 | 2026-01-28 09:58:22 | | orders | BASE TABLE | InnoDB | 15 | 2026-01-28 09:58:22 | +------------+------------+--------+------------+---------------------+ As you can see, this specific database has two tables named customers and orders.\nHere is how you would list the fields in the customers table:\nDESCRIBE customers; +----------------+----------------------------------+------+-----+---------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------------+----------------------------------+------+-----+---------------------+----------------+ | customer_id | int(11) | NO | PRI | NULL | auto_increment | | first_name | varchar(50) | NO | | NULL | | | last_name | varchar(50) | NO | | NULL | | | email | varchar(100) | YES | | NULL | | | phone | varchar(20) | YES | | NULL | | | address | varchar(200) | YES | | NULL | | | city | varchar(50) | YES | | NULL | | | state | varchar(50) | YES | | NULL | | | zip_code | varchar(10) | YES | | NULL | | | customer_type | enum(\u0026#39;online\u0026#39;,\u0026#39;in-store\u0026#39;,\u0026#39;both\u0026#39;) | NO | | NULL | | | loyalty_points | int(11) | YES | | 0 | | | created_at | timestamp | YES | | current_timestamp() | | +----------------+----------------------------------+------+-----+---------------------+----------------+ To take a look at the data in each table, you can query a small number of rows:\nSELECT * FROM orders LIMIT 3; 
+----------+-------------+---------------------+------------+------------------------+---------------+-----------------+----------+----------+------------+-------------+----------------+-----------+ | order_id | customer_id | order_date | order_type | item_name | item_category | metal_type | gemstone | quantity | unit_price | total_price | payment_method | status | +----------+-------------+---------------------+------------+------------------------+---------------+-----------------+----------+----------+------------+-------------+----------------+-----------+ | 1 | 1 | 2024-12-15 10:30:00 | online | Diamond Solitaire Ring | rings | 14k White Gold | Diamond | 1 | 2499.99 | 2499.99 | credit_card | delivered | | 2 | 2 | 2024-12-18 14:45:00 | in-store | Pearl Strand Necklace | necklaces | Sterling Silver | Pearl | 1 | 899.00 | 899.00 | cash | completed | | 3 | 3 | 2024-12-20 09:15:00 | online | Sapphire Stud Earrings | earrings | 18k Yellow Gold | Sapphire | 1 | 1250.00 | 1250.00 | paypal | shipped | +----------+-------------+---------------------+------------+------------------------+---------------+-----------------+----------+----------+------------+-------------+----------------+-----------+ SELECT * FROM customers LIMIT 3; +-------------+------------+-----------+--------------------------+----------+----------------+-----------+-------+----------+---------------+----------------+---------------------+ | customer_id | first_name | last_name | email | phone | address | city | state | zip_code | customer_type | loyalty_points | created_at | +-------------+------------+-----------+--------------------------+----------+----------------+-----------+-------+----------+---------------+----------------+---------------------+ | 1 | Sarah | Mitchell | sarah.mitchell@email.com | 555-0101 | 123 Oak Street | Boston | MA | 02101 | online | 450 | 2026-01-28 09:58:43 | | 2 | James | Rodriguez | j.rodriguez@email.com | 555-0102 | 456 Maple Ave | Cambridge | MA | 02139 | in-store | 
1200 | 2026-01-28 09:58:43 | | 3 | Emily | Chen | emily.chen@email.com | 555-0103 | 789 Pine Road | Brookline | MA | 02445 | both | 875 | 2026-01-28 09:58:43 | +-------------+------------+-----------+--------------------------+----------+----------------+-----------+-------+----------+---------------+----------------+---------------------+ Here\u0026rsquo;s a quick summary of the commands used and what they do:\nSQL Statement Description SELECT TABLE_NAME, TABLE_TYPE, ENGINE, TABLE_ROWS, CREATE_TIME FROM information_schema.TABLES WHERE TABLE_SCHEMA = 'jewelry_store'; Lists all tables in a database (MySQL) DESCRIBE customers; Shows the table structure of fields belonging to the table customers SELECT * FROM orders LIMIT 3; Retrieves a sample of 3 rows from a table to inspect actual data. NEVER run a SELECT * on a production database without an accompanying LIMIT x A point to consider in the future as you implement a robust PII monitoring system is to think about shadow systems and data. I have seen many cases where a team only focused on a specific area of the product will cobble together little satellite systems to serve their needs. These shadow systems are not sanctioned and will usually be built quickly which means very little attention has been paid to security and reliability. They can become a risk to the business and so you have to decide how to handle them. In most cases, I\u0026rsquo;ve found that these systems provide a lot of utility.\nOne thing you can consider is to begin an audit process to formalize these systems rather than letting them live in the shadow realm. Once you\u0026rsquo;ve identified these systems, consider adding them to your risk-assessment backlog so that they get a proper audit and once they pass, there is no reason why they can\u0026rsquo;t be brought into the group of official organizational systems.\nWhat Do I Do with the Bad Ideas? # Build and evangelize a data collection policy org-wide. 
Stop collecting data considered a \u0026ldquo;bad idea\u0026rdquo; Phase out existing \u0026ldquo;bad idea\u0026rdquo; data slowly while watching for systemic failure From a security standpoint, you will have to decide what data you do not collect. How you arrive at the decision isn\u0026rsquo;t what matters most; drawing up a good organization-wide policy that explains why you don\u0026rsquo;t collect certain types of data is. This policy will form the basis for your decisions to stop collecting data and to delete already collected data that you shouldn\u0026rsquo;t have. Deleting data proactively is not a topic you often read about, so if you arrive at this point, first off, congratulations on taking your users\u0026rsquo; data privacy more seriously. Next, the process has to be planned much like a feature deployment. The logical flow will look something like this:\nAlter the product source code to stop writing to and reading from those specific fields in your database. Monitor this for some time. Redact the data in your database. For example, if the field is a date and you need to remove the year and day but not the month, replace the day and year with fixed values. If the date was 04/07/1988, you can set it to 01/07/1970, borrowing the day and year from the Unix epoch (01/01/1970). If it was a string, you can consider covering half of it with asterisks. If you collected an ID number, you can redact most of the digits: an ID number of S1844933F can be redacted as S18******. Again, monitor this for some time. Remove the field entirely once you are sure that none of your systems will break if they cannot read that data. Remember that the data should be redacted at the database level. The masking should not be just a UI thing. Further, keep in mind that even redacted data can eventually be linked to a specific user through other pieces of data. 
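As a rough illustration of the redaction rules above, here is a minimal Python sketch. The function names and the exact masking choices (how many characters stay visible, which half of a string gets covered) are my own assumptions for the example, not a prescribed method, and a real rollout would run against the database rather than in application memory.

```python
from datetime import date

# Hypothetical helpers: names and masking rules are illustrative assumptions.
EPOCH = date(1970, 1, 1)


def redact_date_keep_month(value):
    # Keep only the month; take the day and year from the Unix epoch.
    return date(EPOCH.year, value.month, EPOCH.day)


def redact_id_number(value, keep=3):
    # Keep the first `keep` characters and mask the rest with asterisks.
    return value[:keep] + '*' * max(len(value) - keep, 0)


def redact_half(value):
    # Leave the first half of the string visible and cover the rest.
    visible = len(value) // 2
    return value[:visible] + '*' * (len(value) - visible)
```

For instance, redact_date_keep_month(date(1988, 7, 4)) gives date(1970, 7, 1) and redact_id_number('S1844933F') gives 'S18******'. In practice the same rules would be applied in the database itself, for example through an UPDATE statement run as part of a planned migration. 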
Lastly, you have to address the data stored in backups as well. While I don\u0026rsquo;t recommend going into your backups and redacting the data, add an extra step in your backup/restore plan to redact the data once restored. Saying no and walking away doesn\u0026rsquo;t solve anything. The goal is working with teams to find solutions that meet business needs without unnecessary data collection. A good example of this is perhaps when a marketing team wants to collect the full date of birth. Your position should not be to say \u0026ldquo;No you can\u0026rsquo;t collect that.\u0026rdquo;, but instead steer the conversation in a manner similar to:\nYou: \u0026ldquo;Why do you want to collect the date of birth of the customer?\u0026rdquo; Marketing: \u0026ldquo;We want to run a birthday campaign.\u0026rdquo; You: \u0026ldquo;How does that work?\u0026rdquo; Marketing: \u0026ldquo;We will send them an email with a discount code on their birthday month.\u0026rdquo; You: \u0026ldquo;Ok, can you consider only collecting their birth month then? Since you won\u0026rsquo;t need day or year, I\u0026rsquo;d advise against collecting it.\u0026rdquo;\nKeeping the List Current # Build in triggers on which this entire exercise will repeat If no triggers, consider a time-based review every 3 or 6 months. As is the characteristic of technology, it won\u0026rsquo;t remain the same. Business needs change, products, tech and even people change. So the process you build requires revisiting. You have to know what your triggers are to kick off another exercise and how you can make the process easily repeatable. Some triggers can be whenever you launch a new major feature, when you begin working with a new vendor or third party provider or even at a set date every 3 or 6 months.\nClosing Thoughts #This is a process that you build, run, and maintain. Your list of data will likely never be complete, but that\u0026rsquo;s what you signed up for. 
The idea of even having a process like that can go a long way in greatly diminishing fines you pay to bodies like the Singapore PDPC. There is no definitive guide on how fines are quantified, but one thing that the PDPC will recognize is proactiveness in protecting user data. Similarly with other regulatory bodies, demonstrating that you take your user data seriously by building an audit practice like this can go a long way. You\u0026rsquo;re not infallible and you will make mistakes but minimizing the blast radius is what you should aim for.\n","date":"2 March 2026","permalink":"https://sheran.io/blog/before-the-breach-pii/","section":"Blog","summary":"Do you know what personal data your company is actually collecting? In this article we discuss what it means to collect Personally Identifiable Information (PII) and how to understand and protect this data within an organization.","title":"Before The Breach - PII"},{"content":"","date":null,"permalink":"https://sheran.io/blog/","section":"Blog","summary":"","title":"Blog"},{"content":"","date":null,"permalink":"https://sheran.io/tags/data-protection/","section":"Tags","summary":"","title":"Data-Protection"},{"content":"","date":null,"permalink":"https://sheran.io/","section":"Home","summary":"","title":"Home"},{"content":"","date":null,"permalink":"https://sheran.io/tags/pii/","section":"Tags","summary":"","title":"Pii"},{"content":"","date":null,"permalink":"https://sheran.io/tags/privacy/","section":"Tags","summary":"","title":"Privacy"},{"content":"","date":null,"permalink":"https://sheran.io/tags/security/","section":"Tags","summary":"","title":"Security"},{"content":"","date":null,"permalink":"https://sheran.io/tags/","section":"Tags","summary":"","title":"Tags"},{"content":"","date":null,"permalink":"https://sheran.io/tags/dns/","section":"Tags","summary":"","title":"Dns"},{"content":"As helpful as everyone wants to be about teaching Zig, it\u0026rsquo;s a hard thing to do when the language is evolving this 
quickly. With Zig 0.16 almost upon us, and with breaking standard library changes already landing, it felt like a good time to revisit some real code.\nZig 0.16 is inevitable I had a small Zig program that performed DNS lookups using std.net in Zig 0.15, and I decided to port it to Zig 0.16 to see how the new IO model and std.Io.net APIs actually behave in practice. This was my Zig 0.15 code:\nconst std = @import(\u0026#34;std\u0026#34;); pub fn main() !void { var gpa = std.heap.DebugAllocator(.{}){}; defer std.debug.assert(gpa.deinit() == .ok); const allocator = gpa.allocator(); const hostname = \u0026#34;sheran.sg\u0026#34;; const result = try std.net.getAddressList(allocator, hostname, 0); defer result.deinit(); for(result.addrs) |r| { var writer: std.Io.Writer.Allocating = .init(allocator); defer writer.deinit(); try r.format(\u0026amp;writer.writer); const ip = try writer.toOwnedSlice(); defer allocator.free(ip); std.debug.print(\u0026#34;Ip: {s}\\n\u0026#34;,.{ip}); } } Fairly straightforward code. It does a DNS lookup, collects all the IP addresses that belong to the hostname and prints them with std.debug.print, which writes to stderr. This includes IPv4 and IPv6 addresses.\nIn Zig 0.16, std.net is gone. Well, not gone, but moved. This change has also broken zig std. The move of std.net makes sense. Since the net-related functions were clearly IO, they have appropriately been moved to std.Io.net. Ok so how do we now do a lookup of a hostname? 
Ah here we are: std.Io.net.HostName.lookup()\nI looked at the Zig manual (read: source-code) and naively coded this up:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 const std = @import(\u0026#34;std\u0026#34;); const Io = std.Io; pub fn main() !void { var gpa = std.heap.DebugAllocator(.{}){}; defer std.debug.assert(gpa.deinit() == .ok); const allocator = gpa.allocator(); const hostname: Io.net.HostName = try .init(\u0026#34;sheran.sg\u0026#34;); var threaded: Io.Threaded = .init(allocator); defer threaded.deinit(); const io = threaded.io(); var elem_buf: [16]Io.net.HostName.LookupResult = undefined; var queue: Io.Queue(Io.net.HostName.LookupResult) = .init(\u0026amp;elem_buf); var cname_buf: [Io.net.HostName.max_len]u8 = undefined; Io.net.HostName.lookup(hostname, io, \u0026amp;queue, .{ .port = 0, .canonical_name_buffer = \u0026amp;cname_buf, }); while(queue.getOne(io)) |result| { switch(result) { .address =\u0026gt; { std.debug.print(\u0026#34;{any}\\n\u0026#34;,.{result}); }, .canonical_name =\u0026gt; {}, .end =\u0026gt; |e| { return e; }, } } else |_| {} } I ran it and it worked first time! I\u0026rsquo;m obviously a genius. Now, the more eagle-eyed of you may ask, \u0026ldquo;Why did you pick 16 as your elem_buf size?\u0026rdquo; (line 16) Good question. I used 16 because the source code said: Guaranteed not to block if provided queue has capacity at least 16. Then I had this nagging feeling, \u0026ldquo;Ok but what if there were 17 IPs in a DNS lookup?\u0026rdquo;\nI tested that hypothesis by adding 17 A records to a domain name I owned, ran the code and it hangs. Deadlocked. Quickly falling off that Dunning-Kruger peak now. The deadlock happens because the lookup() is trying to add a 17th element to the queue and my consumer hasn\u0026rsquo;t even started so that it can pull the elements off the queue. So lookup() waits for the queue to empty and won\u0026rsquo;t run the consumer code below. 
(line 25\u0026hellip;)\nIncidentally, I also checked the RFC for DNS and technically, there\u0026rsquo;s no limit to the number of A records that each domain name can have, which made me think that there could be an interesting fuzz test for code that does DNS resolutions. But that\u0026rsquo;s for another post.\nThen I remembered Andrew\u0026rsquo;s talk at Zigtoberfest \u0026lsquo;25. He spoke about the use of both async AND concurrent in the new Zig 0.16 release. My use case looked very much like it needed io.concurrent() so I tried that and it worked!:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 const std = @import(\u0026#34;std\u0026#34;); const Io = std.Io; pub fn main() !void { var gpa = std.heap.DebugAllocator(.{}){}; defer std.debug.assert(gpa.deinit() == .ok); const allocator = gpa.allocator(); var threaded: Io.Threaded = .init(allocator); defer threaded.deinit(); const io = threaded.io(); const hostname = try Io.net.HostName.init(\u0026#34;sheran.sg\u0026#34;); var elem_buf: [2]Io.net.HostName.LookupResult = undefined; var queue: Io.Queue(Io.net.HostName.LookupResult) = .init(\u0026amp;elem_buf); var name_buf: [Io.net.HostName.max_len]u8 = undefined; var lookup = try io.concurrent(Io.net.HostName.lookup, .{ hostname, io, \u0026amp;queue, .{ .port = 0, .canonical_name_buffer = \u0026amp;name_buf, } }); defer lookup.cancel(io); while(queue.getOne(io)) |res| { switch(res) { .address =\u0026gt; { var writer: std.Io.Writer.Allocating = .init(allocator); try res.address.format(\u0026amp;writer.writer); const ip_port = try writer.toOwnedSlice(); defer allocator.free(ip_port); std.debug.print(\u0026#34;Ip {s}\\n\u0026#34;,.{ip_port}); }, .canonical_name =\u0026gt; { std.debug.print(\u0026#34;Cname {s}\\n\u0026#34;,.{res.canonical_name.bytes}); }, .end =\u0026gt; |e| { return e; }, } } else |_| {} } What\u0026rsquo;s interesting here is that now, the elem_buf buffer becomes optional 
and you can create the queue like .init(\u0026amp;.{}) (line 16) and it will still work. With this new code the lookup will kick off concurrently and the consumer loop will start immediately after. It no longer has a problem with handling more than the buffer size allocated because the consumer will pull elements off the queue. Zig\u0026rsquo;s std.Io.Queue works very much like Go channels. Actually in his \u0026lsquo;Don\u0026rsquo;t forget to flush\u0026rsquo; talk (at 17:41), Andrew makes a comparison with Go.\nTo take it a step further, I decided to test the Zig async vs concurrent by running my code like this: zig run dnslookup.zig -fsingle-threaded which complained correctly that ConcurrencyUnavailable. I then changed my io.concurrent() to io.async() and tried it in single threaded mode again and this time there was an ominous looking error that when traced, crashed because of deadlocking. That\u0026rsquo;s because you can\u0026rsquo;t safely run both the lookup and the consumer on the same thread when the lookup can block.\n","date":"16 December 2025","permalink":"https://sheran.io/blog/porting-dns-from-zig-0.15-to-0.16/","section":"Blog","summary":"Porting DNS lookups from Zig 0.15 to 0.16, exploring std.Io.net, async vs concurrent IO, and a subtle deadlock caused by bounded queues.","title":"Porting DNS Code from Zig 0.15 to 0.16: IO, Queues, and Concurrency"},{"content":"","date":null,"permalink":"https://sheran.io/tags/zig/","section":"Tags","summary":"","title":"Zig"},{"content":"Edit your ~/.config/nvim/init.lua file and search for disable_filetypes and look for the line that looks like this:\nlocal disable_filetypes = { c = true, cpp = true }\nAdd zig = true to the list of filetypes.\nlocal disable_filetypes = { c = true, cpp = true, zig = true }\nEdit: 19 November 2025\nSo apparently, you can add comments of // zig fmt: off and // zig fmt: on to your code to selectively leave the formatting of your code alone.\n","date":"13 November 
2025","permalink":"https://sheran.io/blog/disable-zig-auto-format-kickstart/","section":"Blog","summary":"How to disable Zig auto-format in Neovim with Kickstart configuration","title":"Disable Zig auto-format in Neovim and Kickstart"},{"content":"","date":null,"permalink":"https://sheran.io/tags/neovim/","section":"Tags","summary":"","title":"Neovim"},{"content":"The new Zig 0.15.1\u0026rsquo;s std.Io interface is gnarly. When you see the performance gains you can get when using these new Reader and Writer interfaces, you begin to soften on why every goddamn thing needs a buffer. Then the flushing. My God the flushing. I was stuck for two days trying to use the new tls.Client in Zig 0.15.2. Why not just use the std.http.Client you ask? I needed to get a bit more custom. To tell you the truth, the tls.Client is way too opaque and high-level for my needs, but let\u0026rsquo;s take it a step at a time.\nI\u0026rsquo;d used the resources I blogged about earlier in the week to put together this small piece of code to write to a tls server. It would do a basic HTTP GET, that\u0026rsquo;s it. Little did I know I\u0026rsquo;d take two days to figure out. 
Here\u0026rsquo;s the first piece of code I wrote:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 const std = @import(\u0026#34;std\u0026#34;); pub fn main() !void { var gpa = std.heap.DebugAllocator(.{}){}; defer std.debug.assert(gpa.deinit() == .ok); const allocator = gpa.allocator(); const hostname = \u0026#34;sheran.sg\u0026#34;; var conn = try std.net.tcpConnectToHost(allocator, hostname, 443); defer conn.close(); var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var reader = conn.reader(\u0026amp;read_buf); var writer = conn.writer(\u0026amp;write_buf); var orb: [8192]u8 = undefined; var owb: [8192]u8 = undefined; var bundle = std.crypto.Certificate.Bundle{}; try bundle.rescan(allocator); defer bundle.deinit(allocator); const options: std.crypto.tls.Client.Options = .{ .write_buffer = \u0026amp;owb, .read_buffer = \u0026amp;orb, .host = .{ .explicit = hostname }, .ca = .{ .bundle = bundle }, }; var tls_client: std.crypto.tls.Client = try std.crypto.tls.Client.init(reader.interface(), \u0026amp;writer.interface, options); var stdout = std.fs.File.stdout().writer(\u0026amp;.{}).interface; try tls_client.writer.print(\u0026#34;GET / HTTP/1.1\\r\\n\u0026#34;, .{}); try tls_client.writer.print(\u0026#34;Host: {s}\\r\\n\u0026#34;, .{hostname}); try tls_client.writer.print(\u0026#34;Connection: Close\\r\\n\u0026#34;, .{}); try tls_client.writer.print(\u0026#34;\\r\\n\u0026#34;, .{}); try tls_client.writer.flush(); var bytesread: usize = 0; while (true) { const read = tls_client.reader.stream(\u0026amp;stdout, .unlimited) catch |err| switch (err) { error.EndOfStream =\u0026gt; break, else =\u0026gt; |e| return e, }; bytesread += read; } try tls_client.end(); } Seemed kinda reasonable. 
I\u0026rsquo;d follow Andrew\u0026rsquo;s advice and not forget to flush, but this code did not work. Why? Because I didn\u0026rsquo;t flush the tls_client\u0026rsquo;s encrypted stream as well (.output) . I had only flushed the plaintext stream. So I just had to add that additional line of code and the code behaved as I wanted it to. Here is the working code:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 const std = @import(\u0026#34;std\u0026#34;); pub fn main() !void { var gpa = std.heap.DebugAllocator(.{}){}; defer std.debug.assert(gpa.deinit() == .ok); const allocator = gpa.allocator(); const hostname = \u0026#34;sheran.sg\u0026#34;; var conn = try std.net.tcpConnectToHost(allocator, hostname, 443); defer conn.close(); var read_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var write_buf: [std.crypto.tls.max_ciphertext_record_len]u8 = undefined; var reader = conn.reader(\u0026amp;read_buf); var writer = conn.writer(\u0026amp;write_buf); var orb: [8192]u8 = undefined; var owb: [8192]u8 = undefined; var bundle = std.crypto.Certificate.Bundle{}; try bundle.rescan(allocator); defer bundle.deinit(allocator); const options: std.crypto.tls.Client.Options = .{ .write_buffer = \u0026amp;owb, .read_buffer = \u0026amp;orb, .host = .{ .explicit = hostname }, .ca = .{ .bundle = bundle }, }; var tls_client: std.crypto.tls.Client = try std.crypto.tls.Client.init(reader.interface(), \u0026amp;writer.interface, options); var stdout = std.fs.File.stdout().writer(\u0026amp;.{}).interface; try tls_client.writer.print(\u0026#34;GET / HTTP/1.1\\r\\n\u0026#34;, .{}); try tls_client.writer.print(\u0026#34;Host: {s}\\r\\n\u0026#34;, .{hostname}); try tls_client.writer.print(\u0026#34;Connection: Close\\r\\n\u0026#34;, .{}); try tls_client.writer.print(\u0026#34;\\r\\n\u0026#34;, .{}); try tls_client.writer.flush(); // flush the plaintext writer try 
tls_client.output.flush(); // flush the encrypted writer var bytesread: usize = 0; while (true) { const read = tls_client.reader.stream(\u0026amp;stdout, .unlimited) catch |err| switch (err) { error.EndOfStream =\u0026gt; break, else =\u0026gt; |e| return e, }; bytesread += read; } try tls_client.end(); } ","date":"31 October 2025","permalink":"https://sheran.io/blog/figured-out-zig-tls-client/","section":"Blog","summary":"After a two day struggle, I finally figured out how to use Zig 0.15.2\u0026rsquo;s tls.Client to talk to some https servers. yay.","title":"I finally figured out Zig's tls.Client"},{"content":"I wasn\u0026rsquo;t as affected as others by the new Zig std.Io interface and the Writergate scandal when Zig 0.15.0 was released. But I did want to learn the interface because I was writing a TCP client/server thing. I have yet to figure out what I want it to be. Along the way, I collected some links and resources that helped me better understand how to use std.Io. I offer the list of resources here to anyone else who may find themselves stuck. Perhaps one of them can help unlock something for you.\nVideo (ComputerBread) 16:56 - Great overall presentation on the interface Ziggit Post - Shows how to stream a reader to another writer Video (sphaerophoria) ~40min - Looking at how std.fs.File works Video (Andrew Kelley - Don\u0026rsquo;t forget to flush) 37:09 - How the new interface relates to low-level performance Blog Post (Karl Seguin) - Explores using the interface for networking ","date":"30 October 2025","permalink":"https://sheran.io/blog/zig-0.15-new-std-io/","section":"Blog","summary":"This is a small list of videos and articles that helped me learn the new Zig 0.15.0 std.Io interface. Sharing it so anyone else stuck may get some inspiration","title":"These helped me learn the new Zig std.Io interface"},{"content":"Are you a CTO? 
Did the pesky board decide that you needed to have a Pentest to prove the fiefdom you painstakingly cultivated is building instead of talking in circles? Are they bringing in a Big 4 accounting firm to spoil the month-long Civilization gaming marathon you had planned? Here are some great ideas you can use to energize the Pentesters, put the entire company and the next fundraise at risk and pass your Pentest with flying colors.\nPentesters are employed globally to kill vibes in engineering teams. Kendrick Lamar was a hardcore Haskell developer going through a painful time on a Pentest when he wrote the song \u0026ldquo;Bitch, Don\u0026rsquo;t Kill My Vibe\u0026rdquo;. Through the years, like how the Stanford Prison Experiment evolved, Pentesters have become emboldened to a point where they believe they run the show. These delusions must be dealt with swiftly. Use the techniques outlined here to demonstrate that you are an engineering leader not to be trifled with. Don\u0026rsquo;t worry, these techniques require less effort than paying down your mountain of tech debt and are far more entertaining.\n1. Discovery by Gas Light # Give the Pentesters a VPN to connect to the \u0026ldquo;internal\u0026rdquo; network. Then fill this network with fake hosts. Use Docker to create a network and then randomly populate this network with containers that use Honeyd to respond as if they were real hosts. Randomize MAC and IP addresses and pick some descriptive names like \u0026ldquo;CustDB\u0026rdquo;, \u0026ldquo;AdminWS\u0026rdquo;, \u0026ldquo;VoIPSrvr\u0026rdquo; or \u0026ldquo;AuthServer\u0026rdquo;. Make sure to set up high latency and include some packet loss when configuring your hosts using Honeyd. This is an effective way of pacing an impatient Pentester\u0026rsquo;s -T4 scans. 
Pentesters love to manually tweak their Nmap scans and you can tell they\u0026rsquo;re having fun when they start pulling out --initial-rtt-timeout, --min-rtt-timeout and --max-rtt-timeout.\nYou want to have the Pentesters find these fake hosts during discovery when they\u0026rsquo;re running their Nmap scans. Script this setup to teardown and reset itself at the end of each day so that the Pentesters will not find the hosts they discovered from the day before and will have to start their discovery again. If you automate this process and can run it several times a day, you will win big because that can tackle their Zmap scans which they will resort to when their Nmap scans take too long to complete.\nSow the seeds of doubt in your CEO\u0026rsquo;s mind by saying the Pentesting budget could have been better used to take a third engineering off-site in an exotic location for the year. Say these words: \u0026ldquo;These Pentesters can\u0026rsquo;t find that exotic location with a $200,000 Google Maps Places API budget. So how can they be expected to find our internal servers?\u0026rdquo;\nEffort: 8 Entertainment: 6\nNow where were those hosts I scanned yesterday? 2. Drop the Production Database #Drop the two tables with the largest number of rows and highest IO, then take a long lunch. It doesn\u0026rsquo;t matter if you \u0026ldquo;eat lunch\u0026rdquo; at 10am. Do this when the Pentesters start their WebApp Pentesting. This will give the company a real-world opportunity to test the moat that the CEO pitched to the investors during the last fund raise. The only people other than your customers that need functioning databases is your growth team. But they\u0026rsquo;re still learning not to run nested SELECT * statements on prod during peak times, so they\u0026rsquo;re used to waiting minutes for a response anyway. They should really speak with my friend Crystal. 
But that\u0026rsquo;s for another time.\nCome back after 3 hours and then announce on Slack that you\u0026rsquo;re adding indexes to some tables to \u0026ldquo;improve performance\u0026rdquo; during peak times and that the \u0026ldquo;CREATE INDEX\u0026rdquo; statement has to halt all DB operations while it runs on your 4 million row table. Prime one of your engineers to speak at length about the amount of seconds you will save in a year if you shave off 10,000ns from your DB reads. Tell him his only job is to answer Slack messages and talk to the CEO because you had to handle this critical, performance enhancing task. Give him a bonus if he has word bingo on \u0026ldquo;orthogonal\u0026rdquo;, \u0026ldquo;canonical\u0026rdquo;, and \u0026ldquo;security protocols\u0026rdquo;.\nPentesters can\u0026rsquo;t abuse SQL Injection when there\u0026rsquo;s no way to run SQL. Find them and insist on \u0026ldquo;being helpful\u0026rdquo; by telling them about this cool new \u0026ldquo;front end\u0026rdquo; tcp layer your engineers wrote to not only speed up database transactions but also block SQL injection. Tell the Pentesters to \u0026ldquo;try harder\u0026rdquo;, they love hearing this.\nFind the guy in the company that they\u0026rsquo;re grooming to be the CISO and ask him in front of the CEO about his business continuity plan and why it has not kicked into action because he\u0026rsquo;s costing the company thousands in lost revenue per minute. Let him stew for a few days by not being available to discuss anything. Then swoop in and kick off a full DB restore to save the day. After about 4 days of no DB access, the Pentesters would have moved on to another phase and your CEO wouldn\u0026rsquo;t care about anything other than getting the business started back up so he can keep the investors off his ass.\nEffort: 7 Entertainment: 8.5\nWhere is your SQL Injection now? 3. 
Patch Early, Patch Often #Insist that the Pentesters share a daily report of their WebApp Pentest findings. Pentesters relish the thought of writing a report after a stressful day of Pentesting non-responsive WebApps and high latency port scans. It is also a psychological win for you because you force them to re-live their failures of the day.\nUse this technique to catch those stray moments when a possible SQL Injection or pesky XSS rears its ugly head. Share this report with your engineering team and threaten them to come up with a fix or make the finding go away by the next day (remember that you built this team, so no threat you make to them is ever illegal). Pentesters love to uncover potential entrypoints to exploits that they can weaponize later. These ego-driven showboaters are so dramatic, they live for the moments when they can string together a bunch of smaller exploits into a bigger, more serious one. This in itself isn\u0026rsquo;t the dramatic part, that\u0026rsquo;s reserved for the grandstanding during their final presentation to the CEO and the board. Unburden them of these moments by making the findings disappear as soon as they are discovered.\nInvariably they will complain to the CEO that the \u0026ldquo;Pentest is only valuable on a frozen state architecture with minimal changes.\u0026rdquo; This is the moment that you hit back at them with the line \u0026ldquo;Well, in the real world, architectures are evolving and attackers have to deal with that.\u0026rdquo; Pentesters will usually wax poetic about the \u0026ldquo;real-world\u0026rdquo; so they will appreciate it and be impressed when you use language they are familiar with.\nEffort: 5 Entertainment: 7\nI could have sworn there was an RCE there yesterday! 4. Point all your DNS entries at CIA and FBI IP addresses #This two-pronged technique is both science and art. The technical part is easy: just point the DNS A records of your app and database servers to IPs within the IP blocks of the FBI and CIA. 
If you can\u0026rsquo;t be arsed, then just point them to the web servers of the FBI and CIA. It is vital that you only do this if there is a \u0026ldquo;remote\u0026rdquo; component to the Pentest. This means that the Pentester is conducting his scans of your infrastructure from his own home, office, or cafe. This way, the Pentester can take all the credit, that he so richly deserves, for starting a scan and leaving his laptop for it to complete. Pentesters need to practice extreme ownership in order to grow and advance in the ranks of other Pentesters. You will be helping.\nNow, the art of this technique lies in convincing not just the Pentesters, but also the company, board, and CEO that you are working together with the US authorities to catch people that want to harm human and animal kind. This is why you have volunteered to assist by changing the company servers to point to the ones operated by the CIA and FBI.\nAs for the loss of customers and revenue, repeat the scenario from technique 2 of inviting the potential CISO to take ownership of this situation. Question his lack of relationship building with authorities to keep him on his toes. Give him the number to the general hotline of the CIA and gather in a room with the CEO and board to watch as he calls them up to \u0026ldquo;take charge\u0026rdquo; of this operation. When he fails and gets yelled at by the CIA, volunteer in front of everyone to take him under your wing and help him learn how to build relationships and be on top of things but then be unavailable when he takes you up on that offer. This should get you enough time to at least play a little bit of Civilization until the CISO gets to grips with the developing situation.\nSwoop back in on day 3 and restore the DNS records. Tell the CEO and board that the authorities just did not trust the CISO and that they recommended you run an in-depth background check on him. 
Post on LinkedIn that you were humbled to work with the authorities to flush moles out from within the company. Blame any country that\u0026rsquo;s being vilified in the press that week.\nEffort: 4 Entertainment: 8\nWell, shit. 5. Post on LinkedIn before the Pentest starts #Forewarned is forearmed, said someone. So make sure to take the opportunity to tell everyone about the upcoming Pentest the minute the ink on the contract has dried. Post on LinkedIn, making sure you are humbled and honored, about the Pentest being conducted on your infrastructure. Make sure to provide lots of information so that any third parties that work with you can tighten up their weak APIs that you\u0026rsquo;ve been exploiting instead of actually writing the features in-house. Next, email everyone in the company and post on Slack that they are to expect a team of Pentesters in the coming days. Do not specifically tell them to increase security and be tight-lipped, but give enough clues that you know what they\u0026rsquo;re doing when no one is watching so that they know not to speak ill of the engineering team.\nThen when the CEO proudly ushers the Pentest team into the office trying to pass them off as the \u0026ldquo;data cleaning crew\u0026rdquo;, everyone will know his treachery. This will win you more favor because you were the only one that cared enough to inform them not to print out the shared password on a banner and hang it in the office. Pentesters thrive on operating in real-world environments, and this heightened staff paranoia is something that they will take joy in dissecting. If a networked departmental photocopier with a default password is switched off, is it really vulnerable? Have a private Slack channel where people can report on what the Pentesters are doing and their whereabouts at all times.\nEffort: 1 Entertainment: 4\nI am humbled and honored to pwn these systems. 
As you can see, these techniques are strongly correlated to influencing the outcome of any Pentest. In the legal system, a lack of evidence is a strong indicator of innocence. Similarly, in Pentesting, inconclusive results do not indicate risk or vulnerabilities. It raises important questions about the Pentesters and their level of competence. It lowers the importance of future Pentests in the eyes of the CEO and board and it allows you to go about your work unencumbered. As your spoils of war, you also get the additional budget for Pentesting to spend on your department.\nSo if you are that CTO looking to run your next off-site at an exotic location, consider Sri Lanka. My friend and I have painstakingly built a hotel in Sri Lanka that not only has 14 rooms to accommodate your engineering team, but also a large co-working space overlooking the ocean. We\u0026rsquo;ve even got 3 meeting rooms, one large enough to house your entire board where you can discuss your next white elephant project that you and the team want to embark on. Book now, we\u0026rsquo;re open for business!\n","date":"13 July 2025","permalink":"https://sheran.io/blog/how-to-pass-a-pentest/","section":"Blog","summary":"Pentests are stressful to go through. 
Follow these easy steps as an engineering leader if you want to breeze through any Pentest that you find yourself facing.","title":"How to Pass a Pentest in a few easy steps!"},{"content":"","date":null,"permalink":"https://sheran.io/tags/pentest/","section":"Tags","summary":"","title":"Pentest"},{"content":"","date":null,"permalink":"https://sheran.io/tags/redteam/","section":"Tags","summary":"","title":"Redteam"},{"content":"","date":null,"permalink":"https://sheran.io/tags/webappsec/","section":"Tags","summary":"","title":"Webappsec"},{"content":"","date":null,"permalink":"https://sheran.io/tags/winning/","section":"Tags","summary":"","title":"Winning"},{"content":"","date":null,"permalink":"https://sheran.io/tags/encryption/","section":"Tags","summary":"","title":"Encryption"},{"content":"I recently needed to run an isolated VM on my Macbook. I wanted to work on the Mac where I would not leave any trace on it in case it went through a forensic analysis. So I decided to setup a Linux VM with an encrypted disk. This would allow me to operate in more or less isolation from the host Mac and not worry about disk image recovery after I delete it. Here are the instructions on how you, too, can set up a VM like this on your Mac. You will need UTM and an Apple Silicon Mac which is why all the fuss exists for ARM64.\nDownload and Install UTM #First download and install UTM on your Mac from the UTM site. Then download the Void Linux base image for arm with glibc from the site. As of writing this post, I used void-live-aarch64-20250202-base.iso\nUse the Install Script #Next, get the install script to setup the VM from this repo:\ngit clone git@github.com:sheran/voidvm-arm-efi-installer.git cd voidvm-arm-efi-installer Run a python web server in this directory so that the script is hosted on your Mac host:\npython -m http.server 8008 Create and Configure the Void Linux VM with the startup script #Then create a new VM on UTM. I used pretty much all the defaults. 
Of course, set your memory, CPU cores and disk size to what it is you can afford to set. Attach the image you downloaded to the CDROM and boot it up and then login to the machine with username root and password voidlinux\nNext, let\u0026rsquo;s install curl so that we can download the script we need to setup the machine:\nxbps-install -Sy curl Now, let\u0026rsquo;s get the IP address of the gateway which is also the host that is running the python server. If you used the Void Linux image and your networking was configured, run:\nip route show default And you should see something like this:\ndefault via 192.168.64.1 dev eth0 proto dhcp src 192.168.68.58 metric 1002 In this case, my default gateway is 192.168.64.1\nFinally, we run curl and pipe it to sh:\ncurl -fsSL http://192.168.64.1:8008/setup.sh | sh Replace 192.168.64.1 with whatever the IP address was from before.\nThis should kick off the entire installation process with the defaults. It will assume the disk you will install to is /dev/vda ALL data will be erased on that without confirmation. The script will also set the disk encryption and root password to password.\nChanging Disk Password #If you want to change the disk and the password to your own values, run the script like this:\ncurl -fsSL http://192.168.64.1:8008/setup.sh | DISK=\u0026#34;/dev/sda\u0026#34; CRYPT_PASSWORD=\u0026#34;s3kr3tp455\u0026#34; sh The script sets both disk encryption password and root password to the same, it should be fairly trivial to change this if you look at the script.\nOnce the installation completes, you can type poweroff to switch the machine off. Then clear the ISO image from the CDROM drive attached. Lastly, you can boot into your newly created Void Linux VM. You will be prompted to enter the decryption password as shown in the image below.\nDecryption password for the disk Now you can go ahead and add another user, install XFCE, etc. 
It\u0026rsquo;s not my place to tell you how to party.\nIf you run into any issues with this, feel free to go and open an issue on https://github.com/sheran/voidvm-arm-efi-installer\n","date":"26 April 2025","permalink":"https://sheran.io/blog/void-linux-arm64-luks-uefi/","section":"Blog","summary":"Here is how you can do a fairly unattended install of Void Linux ARM64 with FDE on MacOS with a UTM VM with UEFI","title":"How to Install Void Linux on MacOS with Disk Encryption"},{"content":"","date":null,"permalink":"https://sheran.io/tags/linux/","section":"Tags","summary":"","title":"Linux"},{"content":"","date":null,"permalink":"https://sheran.io/tags/lvm/","section":"Tags","summary":"","title":"Lvm"},{"content":"","date":null,"permalink":"https://sheran.io/tags/uefi/","section":"Tags","summary":"","title":"Uefi"},{"content":"","date":null,"permalink":"https://sheran.io/tags/void/","section":"Tags","summary":"","title":"Void"},{"content":"","date":null,"permalink":"https://sheran.io/tags/certificates/","section":"Tags","summary":"","title":"Certificates"},{"content":"I stand for shorter SSL certificate lifespans.\nIn a recent vote, the tech giants that assert complete and utter dominance over all of us online unanimously agreed that we needed to reduce the life expectancy of SSL certificates. As much of a pain in the ass as this is, I am a huge fan. Let me tell you why:\nWe\u0026rsquo;ve used public key cryptography to prove identity, establish trust and prove legitimacy of websites since the 70s. We use public key cryptography when we SSH into the production server while we\u0026rsquo;re on holiday at the beach and the boss needs to fix a typo. We use it to sign an email when we want to prove that we wrote it. We use it to encrypt those files that we don\u0026rsquo;t want anyone to ever see immediately after we clear our browser history. 
And most commonly we are aware of it when the pesky browser always tells us that we\u0026rsquo;re visiting an insecure site that we end up bypassing - the SSL certificate.\nThe SSL certificate\u0026rsquo;s primary role is to keep all data going between you and the server it is installed on encrypted. It is the reason why you type in the same password with reckless abandon when you register on every site online. If there were no SSL certificate, anyone eavesdropping on your network, or the network that the website is on, could see your username and password. Anyway, the SSL certificate will move from a maximum validity period of 398 days all the way down to 47 days. This will take place between March 2026 and March 2029.\nThis announcement, coupled with Let\u0026rsquo;s Encrypt\u0026rsquo;s announcement that they will no longer send email reminders to renew your certificate, made me meander into action. I didn\u0026rsquo;t want to sign up and use a third-party service so I wrote a little command line utility to check the validity of my SSL certificates. I called it expirybot and I was pleased. I wanted to just stick this into one of my server crontabs and be done with it, but then I got creative. How about, rather than solving just my problem, I make this a bit more accessible, I thought. I therefore decided to make a GitHub Action which anyone could fork and use for their own domains. Thus was born the SSL Cert Checker GitHub workflow. I know, right? The skill with which I name my projects beggars belief.\nTo monitor when your SSL certs need to be renewed, fork the repo (if you don\u0026rsquo;t mind having your monitored domains visible publicly) or just clone and push to a new private repo of yours to keep away those prying eyes. Then edit the domains.txt file in the repo and put in all your domains with SSL certificates that you want to monitor and push to your repo. The GitHub Action will run every day at 8am UTC. 
You can also trigger the action yourself manually. The format of the domains.txt is just one domain to monitor on each line. Optionally, you can add a comma and a number after the domain. This number is a threshold number of days. If your certificate has fewer than the threshold number of days left before it expires, then you will get an alert. The default for this is 14 days. When a threshold is triggered, the action will create an Issue on the repo. If you subscribe to notifications, then you will get an email alert for the issue creation and then you have a full on, unattended SSL certificate validity checker for absolutely free.\nYou can obviously change the notification mechanism (I personally use the free tier of Mailgun/Sinch to send me an email directly) to either email directly, send a Slack message, etc. Whatever floats your boat. Do write me and let me know the most creative ways you get your notifications. If I particularly like yours and you happen to be in Singapore, I\u0026rsquo;ll buy you a beer.\n","date":"17 April 2025","permalink":"https://sheran.io/blog/check-ssl-cert-expiry-for-free/","section":"Blog","summary":"Do not allow your SSL Certificates to expire. Use GitHub actions to check and alert you when they are about to expire for free","title":"Check your SSL Certificate validity for free with GitHub"},{"content":"","date":null,"permalink":"https://sheran.io/tags/github/","section":"Tags","summary":"","title":"Github"},{"content":"","date":null,"permalink":"https://sheran.io/tags/ssl/","section":"Tags","summary":"","title":"Ssl"},{"content":"Update: 25 May 2025 After trying many different setups, I\u0026rsquo;d recommend you not read any further and use kickstart.nvim. It is very simple, a near one shot setup and works out of the box. 
You may have to set your zls version separately sometimes if you run into issues, but overall, I highly recommend it for a quick and easy setup.\nVideo outlining the process: https://www.youtube.com/watch?v=m8C0Cq9Uv9o\nHere is what I do to configure Neovim to work with Zig for syntax highlighting and the language server:\nSkip to just the files\nYou will need the following: # A MacOS device Neovim \u0026gt;=0.8 installed Zig installed zls version that matches the Zig version above installed tree-sitter-cli installed (can use homebrew: brew install tree-sitter) Configuration: #1. Install a Neovim package manager #I used the lazy.nvim package manager for Neovim. Install it like this: git clone --filter=blob:none https://github.com/folke/lazy.nvim.git ~/.local/share/nvim/lazy/lazy.nvim\nThen we configure Neovim to use lazy.nvim. Edit the file ~/.config/nvim/init.lua and add:\nlocal lazypath = vim.fn.stdpath(\u0026#34;data\u0026#34;) .. \u0026#34;/lazy/lazy.nvim\u0026#34; if not vim.loop.fs_stat(lazypath) then vim.fn.system({ \u0026#34;git\u0026#34;, \u0026#34;clone\u0026#34;, \u0026#34;--filter=blob:none\u0026#34;, \u0026#34;https://github.com/folke/lazy.nvim.git\u0026#34;, lazypath, }) end vim.opt.rtp:prepend(lazypath) Restart Neovim to check if the installation succeeded. You should be able to bring up the lazy.nvim interface by typing :Lazy in Neovim. Exit it by using :q\n2. Add the relevant plugins for Zig #We are going to use the nvim-lspconfig to setup zls and tree-sitter for syntax highlighting. So let\u0026rsquo;s first install those as plugins from lazy.nvim.\nAdd these lines to your ~/.config/nvim/init.lua: 1 2 3 4 5 6 7 require(\u0026#34;lazy\u0026#34;).setup({ { \u0026#34;neovim/nvim-lspconfig\u0026#34; }, { \u0026#34;nvim-treesitter/nvim-treesitter\u0026#34;, build = \u0026#34;:TSUpdate\u0026#34;, }, }) Restart your Neovim and let the plugins get installed.\n3. 
Configure the plugins #We only need to setup nvim-lspconfig so create a directory called lua in your ~/.config/nvim directory. Create a file lsp.lua in that directory:\n~/.config/nvim/lua/lsp.lua contents:\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 local lspconfig = require(\u0026#34;lspconfig\u0026#34;) lspconfig.zls.setup { cmd = { \u0026#34;zls\u0026#34; }, filetypes = { \u0026#34;zig\u0026#34;, \u0026#34;zir\u0026#34; }, root_dir = lspconfig.util.root_pattern(\u0026#34;build.zig\u0026#34;, \u0026#34;.git\u0026#34;) or vim.loop.cwd, single_file_support = true, } vim.api.nvim_create_autocmd(\u0026#34;LspAttach\u0026#34;, { callback = function(args) local opts = { buffer = args.buf } vim.keymap.set(\u0026#34;n\u0026#34;, \u0026#34;gd\u0026#34;, vim.lsp.buf.definition, opts) -- Go to definition vim.keymap.set(\u0026#34;n\u0026#34;, \u0026#34;K\u0026#34;, vim.lsp.buf.hover, opts) -- Hover info end, }) Next we have to edit our ~/.config/nvim/init.lua to use the plugins in neovim. Add the following to your init.lua:\nrequire(\u0026#34;lsp\u0026#34;) require(\u0026#34;nvim-treesitter.configs\u0026#34;).setup { ensure_installed = { \u0026#34;zig\u0026#34; }, highlight = { enable = true }, } Restart your Neovim if you had it open and you should be all set. Here\u0026rsquo;s the full contents of all files:\nTL;DR Just the contents of the files: #~/.config/nvim/init.lua: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 local lazypath = vim.fn.stdpath(\u0026#34;data\u0026#34;) .. 
\u0026#34;/lazy/lazy.nvim\u0026#34; if not vim.loop.fs_stat(lazypath) then vim.fn.system({ \u0026#34;git\u0026#34;, \u0026#34;clone\u0026#34;, \u0026#34;--filter=blob:none\u0026#34;, \u0026#34;https://github.com/folke/lazy.nvim.git\u0026#34;, lazypath, }) end vim.opt.rtp:prepend(lazypath) -- Setup plugins require(\u0026#34;lazy\u0026#34;).setup({ { \u0026#34;neovim/nvim-lspconfig\u0026#34; }, { \u0026#34;nvim-treesitter/nvim-treesitter\u0026#34;, build = \u0026#34;:TSUpdate\u0026#34;, }, }) require(\u0026#34;lsp\u0026#34;) require(\u0026#34;nvim-treesitter.configs\u0026#34;).setup { ensure_installed = { \u0026#34;zig\u0026#34; }, highlight = { enable = true }, } ~/.config/nvim/lua/lsp.lua: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 local lspconfig = require(\u0026#34;lspconfig\u0026#34;) lspconfig.zls.setup { cmd = { \u0026#34;zls\u0026#34; }, filetypes = { \u0026#34;zig\u0026#34;, \u0026#34;zir\u0026#34; }, root_dir = lspconfig.util.root_pattern(\u0026#34;build.zig\u0026#34;, \u0026#34;.git\u0026#34;) or vim.loop.cwd, single_file_support = true, } vim.api.nvim_create_autocmd(\u0026#34;LspAttach\u0026#34;, { callback = function(args) local opts = { buffer = args.buf } vim.keymap.set(\u0026#34;n\u0026#34;, \u0026#34;gd\u0026#34;, vim.lsp.buf.definition, opts) -- Go to definition vim.keymap.set(\u0026#34;n\u0026#34;, \u0026#34;K\u0026#34;, vim.lsp.buf.hover, opts) -- Hover info end, }) ","date":"7 April 2025","permalink":"https://sheran.io/blog/setup-neovim-with-zig/","section":"Blog","summary":"This is what I did to get Neovim configured for Zig syntax highlighting and language server","title":"Setup Neovim With Zig"},{"content":"I had resigned myself to re-installing Nixos each time I did a system update on Macos. Recently, though, there seems to be an issue where you have to actually think and fix the issue without re-installing. 
The lazy bastard in me gawked at all the words in that github issue and with each swipe down the page, fought the undeniable urge to find a yak to shave. So I just typed sh \u0026lt;(curl -L https://nixos.org/nix/install) as I had done countless times over to reinstall Nixos and was hit with:\nIt seems the build group nixbld already exists, but with the UID 30000. This script can't really handle that right now, so I'm going to give up.\nLazy searching of the error message revealed someone on Stackoverflow telling someone else to delete the group and users (for the subsequent error that would also show up after deleting the group) and try again. So I did that. For all you Mac users, here\u0026rsquo;s how to delete a group:\nsudo dscl . delete /groups/nixbld\nAnd here\u0026rsquo;s how to delete the users. First, find the users to delete:\ndscl . list /users\nGaze at the list in awe and then copy all the _nixbld?? usernames to a file. The ? is a representation for any digit 0-9. Then delete them:\ncat nix_users.txt|xargs -I{} sudo dscl . delete /users/{}\nI specifically use a file so that I can\u0026rsquo;t accidentally delete users I\u0026rsquo;m not supposed to delete on my Mac. Be very aware that you can royally screw up your mac if you don\u0026rsquo;t do this correctly and accidentally paste a user account that is not _nixbld??\nThen re-run sh \u0026lt;(curl -L https://nixos.org/nix/install) and your nix-shell should work just fine.\n","date":"17 December 2024","permalink":"https://sheran.io/blog/lazy-macos-guide-to-nixos/","section":"Blog","summary":"Macos Sequoia messes with Nixos. 
You can either fix it by following the Github issue, or you can keep with traditions and just re-install it.","title":"Lazy Guide to re-install Nixos on Macos"},{"content":"","date":null,"permalink":"https://sheran.io/tags/macos/","section":"Tags","summary":"","title":"Macos"},{"content":"","date":null,"permalink":"https://sheran.io/tags/nixos/","section":"Tags","summary":"","title":"Nixos"},{"content":"I\u0026rsquo;ve been learning Zig for a little while, but by no means am I an expert. Karl Seguin is a prolific writer on Zig. His articles are extremely educational and I owe him a debt of gratitude for his posts. His blog, openmymind.net, is a treasure trove of knowledge on not just Zig, but other technical topics and is very much worth your time to visit and bookmark.\nAs DHH said in a recent interview with the Primeagen and Teej, and I\u0026rsquo;m paraphrasing here, \u0026ldquo;a coding language that makes you want to just write code is probably the language you would want to learn.\u0026rdquo; For me, this language has been Zig. I used it to stream-parse large files in my examination of the forensics tool Cybertriage, but I haven\u0026rsquo;t had a chance to do a follow-up blog post just yet. I love Zig. 
Besides being a really fun language to program in for me personally, it also has some really straightforward C interop:\nconst c = @cImport({ @cInclude(\u0026#34;stdio.h\u0026#34;); }); pub fn main() void { const str = \u0026#34;this is a string\u0026#34;; _ = c.printf(\u0026#34;this is a number: %d \\n\u0026#34;, @as(c_int, 42)); _ = c.printf(\u0026#34;this is a pointer: %p\\n\u0026#34;, \u0026amp;str); _ = c.printf(\u0026#34;this is the 6th char of str: %c\\n\u0026#34;, str[5]); } Here\u0026rsquo;s a contrived example of how you can use the C standard library in Zig ➜ sheran@leonov /tmp zig run ccall.zig this is a number: 42 this is a pointer: 0x10418f380 this is the 6th char of str: i ➜ sheran@leonov /tmp\nI\u0026rsquo;m writing some text processing tools and one area I had to focus on was text tokenization. For one part of the tokenization, I needed to use regex. While Zig does not have a regex component built into the standard library, it does offer some rudimentary data matching features in std.mem. Since Zig has not reached 1.0 yet, it is still a language in active development and there is a high likelihood that if the thing you want isn\u0026rsquo;t found in the language, then you\u0026rsquo;re going to have to build it yourself. When I\u0026rsquo;m learning a new language, I\u0026rsquo;d bias towards building my own features rather than finding dependencies for what I need. But build a regex library? I think not.\nThe reason I shared that snippet of code is to show how easy it is to work with C. Then I thought, if I needed a regex component, then why not use one from C? Some active searching brought me to Karl\u0026rsquo;s excellent post on Regular Expressions in Zig. In it, he shows how to use Posix\u0026rsquo;s regex.h and at the same time, points out an existing open issue with Zig wherein to use regex.h, you needed to do some prep beforehand. 
Now I could use regex.h, but again, since I was learning, I set myself a goal of trying to do something not previously attempted. So I decided to use PCRE2 instead.\nI managed to figure it all out over one weekend which surprised even me. Source is below if you want to have a look. I\u0026rsquo;ve put it all in a GitHub repo as well which contains an interesting aside. The PCRE team have very graciously accepted a build.zig file which is in the source. So it is an absolute breeze to integrate with your Zig project. You can check it out in the repo.\nAfter we\u0026rsquo;ve added pcre2 as a dependency and built it, then we can import the header.\nconst std = @import(\u0026#34;std\u0026#34;); const re = @cImport({ @cDefine(\u0026#34;PCRE2_CODE_UNIT_WIDTH\u0026#34;, \u0026#34;8\u0026#34;); @cInclude(\u0026#34;pcre2.h\u0026#34;); }); const PCRE2_ZERO_TERMINATED = ~@as(re.PCRE2_SIZE, 0); Some C macros and defines do not expand well when called from Zig. For example, the PCRE2_ZERO_TERMINATED constant has to be defined by hand with the cast shown above.\nconst pattern : [*]const u8 = pat; var errornumber : i32 = undefined; var erroroffset : usize = undefined; const regeex = re.pcre2_compile_8( pattern, PCRE2_ZERO_TERMINATED, 0, \u0026amp;errornumber, \u0026amp;erroroffset, null); if (regeex == null){ var errormessage : [256]u8 = undefined; const msgLen : c_int = re.pcre2_get_error_message_8(errornumber, \u0026amp;errormessage, errormessage.len); std.debug.print(\u0026#34;Error compiling: {s}\\n\u0026#34;,.{errormessage[0..@intCast(msgLen)]}); return; } Then we have to compile our pattern. Here I\u0026rsquo;ve used Zig primitives instead of C types and they work fine so far. As before, there is an issue with macro expansion, however. You may notice in the code, I\u0026rsquo;ve used functions like re.pcre2_compile_8 or re.pcre2_get_error_message_8. In the C version (which you can see in the repo) the functions are plain pcre2_compile or pcre2_get_error_message without the trailing _8. 
The expansion happens in the pcre2.h header. Unfortunately, Zig does not support text without quotes in a C #define and it will return: error: unable to translate macro: undefined identifier 'pcre2_compile_'. Further, even if you do manage to patch the header file, you will find that Zig will not support the C concatenate operator ## and so you will eventually have to resort to calling the raw functions which are suffixed _8, _16, and _32.\nWhy these suffixes? Because that\u0026rsquo;s how PCRE2 handles UTF-8, 16, and 32 respectively. That\u0026rsquo;s why you have to define PCRE2_CODE_UNIT_WIDTH. So with the code I\u0026rsquo;ve written, I have to be extremely sure that I am feeding the function UTF-8 strings.\nconst subject : []const u8 = sub; const subject_length : usize = subject.len; const match_data = re.pcre2_match_data_create_from_pattern_8(regeex, null); const rc = re.pcre2_match_8(regeex, \u0026amp;subject[0], subject_length, 0, 0, match_data, null); if (rc \u0026lt; 0) { switch (rc) { re.PCRE2_ERROR_NOMATCH =\u0026gt; { std.debug.print(\u0026#34;No match found\\n\u0026#34;,.{}); }, else =\u0026gt; { std.debug.print(\u0026#34;Matching error: {}\\n\u0026#34;,.{rc}); } } re.pcre2_match_data_free_8(match_data); re.pcre2_code_free_8(regeex); } const ovector = re.pcre2_get_ovector_pointer_8(match_data); if (rc == 1){ std.debug.print(\u0026#34;Match found at offset: {}\\n\u0026#34;,.{ovector.*}); } else if (rc \u0026gt; 1){ for(0..@intCast(rc))|i|{ std.debug.print(\u0026#34;{}: {s}\\n\u0026#34;,.{i, subject[ovector[2 * i]..ovector[2 * i + 1]]}); } } re.pcre2_match_data_free_8(match_data); re.pcre2_code_free_8(regeex); Then we specify our haystack, or subject and call re.pcre2_match_8. This is near enough to how you would write the C equivalent. 
Lastly, if we find more than one match (rc \u0026gt; 1), then we loop over the rc variable to find the offsets where the match will start and end.\nYou will notice that we did no Zig memory allocations; that was all handled by C. We called the C free functions similar to the original C source. This is quite fresh code so if you find better ways to do what I did, then PRs are very much welcome!\n","date":"22 September 2024","permalink":"https://sheran.io/blog/building-and-using-pcre2-in-zig/","section":"Blog","summary":"If you want to use regex in Zig, your options are limited. One way is to import existing C libraries. Here is how I used PCRE2 in my Zig code.","title":"How to build PCRE2 with Zig"},{"content":"","date":null,"permalink":"https://sheran.io/tags/pcre2/","section":"Tags","summary":"","title":"Pcre2"},{"content":"","date":null,"permalink":"https://sheran.io/tags/bruteforce/","section":"Tags","summary":"","title":"Bruteforce"},{"content":"","date":null,"permalink":"https://sheran.io/tags/dos/","section":"Tags","summary":"","title":"DoS"},{"content":"Both my EdgeRouter ER-X and EdgeRouter ER-6P exhibit a denial of service attack vector. It isn\u0026rsquo;t the sort of DoS that will bring down the entire router, but it will make the Web GUI non-responsive.\nIt all starts with the two WebSockets that are found listening on the Web GUI. 
These configs can be found in the /etc/lighttpd/lighttpd.conf file:\n$HTTP[\u0026#34;url\u0026#34;] =~ \u0026#34;^/ws/stats\u0026#34; { wstunnel.server = ( \u0026#34;\u0026#34; =\u0026gt; ( ( \u0026#34;socket\u0026#34; =\u0026gt; \u0026#34;/tmp/ubnt.socket.statsd\u0026#34; ) ) ) wstunnel.frame-type = \u0026#34;text\u0026#34; server.max-read-idle = 600 server.stream-request-body = 2 server.stream-response-body = 2 } $HTTP[\u0026#34;url\u0026#34;] =~ \u0026#34;^/ws/cli\u0026#34; { wstunnel.server = ( \u0026#34;\u0026#34; =\u0026gt; ( ( \u0026#34;socket\u0026#34; =\u0026gt; \u0026#34;/tmp/ubnt.socket.cli\u0026#34;) ) ) wstunnel.frame-type = \u0026#34;binary\u0026#34; server.max-read-idle = 600 server.stream-request-body = 2 server.stream-response-body = 2 } Let\u0026rsquo;s zoom in on the second one, the ws/cli entry. If I connect to this WebSocket, I will get a telnet login prompt for my router. I can then login with my credentials and configure the router if I wanted to.\nTelnet via Websocket Why does this WebSocket exist? # This WebSocket exists to allow an admin to pop open a command line window while logged in through the Web GUI. From there, he can continue to configure his router using the CLI. If SSH were not enabled and he were not directly in contact with the router, he can use this approach to configure the router using a shell.\nThe bug #There is a single binary on the router known as ubnt-util and it has many jobs. One of its jobs is to ensure that it can set up the telnet session through the WebSocket. When a connection is made to the ws/cli WebSocket, ubnt-util will open a telnet session to the telnet daemon that is running on 127.0.0.101:55523. Basically, you\u0026rsquo;re connecting from localhost to localhost. When this happens, telnetd will start the /bin/login process which prompts the user to login. After a successful login, you get your shell.\nThe problem occurs when simultaneous telnet clients are waiting for a login. 
Somewhere between the login session timing out with no user input and the connection closing, ubnt-util will become non-responsive. When this occurs, even though the Web GUI login page is visible and responsive, your login will not succeed and you lose access to the Web GUI. The only way to recover from this so far is to SSH into the router and kill ubnt-util. Then, the ubnt-daemon binary will restart a new ubnt-util and all is right again.\nThe sweet spot for making this happen is to send 10 simultaneous connections to the Web Socket and wait 60 seconds before closing each connection.\nHow can this be fixed? #I am waiting for an official response for remediation from Ubiquiti, but until then, I used lighttpd\u0026rsquo;s ACLs to block all remote access to this WebSocket.\n$HTTP[\u0026#34;url\u0026#34;] =~ \u0026#34;^/ws/cli\u0026#34; { $HTTP[\u0026#34;remoteip\u0026#34;] != \u0026#34;192.168.1.0/24\u0026#34; { url.access-deny(\u0026#34;\u0026#34;) } wstunnel.server = ( \u0026#34;\u0026#34; =\u0026gt; ( ( \u0026#34;socket\u0026#34; =\u0026gt; \u0026#34;/tmp/ubnt.socket.cli\u0026#34;) ) ) wstunnel.frame-type = \u0026#34;binary\u0026#34; server.max-read-idle = 600 server.stream-request-body = 2 server.stream-response-body = 2 } In the directive above, if the remote IP is not part of the local network, then deny access to this WebSocket. You can make this as narrow or as broad as you want, but the bottom line is to disallow someone from across the globe to hose your Web GUI.\nHow to test if the WebSocket is open? #You can use the tool utelnet to check for whether your EdgeRouter has the WebSocket enabled. Utelnet will check a few things including your hostname and the approximate age of the Javascript libraries used. This can give a rough estimation of the version of EdgeOS in use. 
If a WebSocket is found, then you will see the hostname as well as websocket: true\nBonus bug #Because this WebSocket allows you to login using telnet, you can easily write a bruteforcer to guess passwords on the router. The kicker is that none of the remote IP\u0026rsquo;s are being logged on the device.\nThe /var/log/ubnt-daemon.log file will show this line for each disconnect and reconnect:\n2024-06-26 09:56:22 cli: Error: Socket write error: write: Broken pipe The /var/log/lighttpd/ubnt-rtr-ui.log file does not log any activity. The /var/log/lighttpd/error.log file does not log any activity.\nThe /var/log/auth.log file will log activity as follows:\nJun 26 10:22:18 EdgeRouter-X-5-Port login[11078]: pam_unix(login:auth): authentication failure; logname=LOGIN uid=0 euid=0 tty=/dev/pts/2 ruser= rhost= user=admin Jun 26 10:22:21 EdgeRouter-X-5-Port login[11078]: FAILED LOGIN (1) on \u0026#39;/dev/pts/2\u0026#39; from \u0026#39;127.0.0.1:35062\u0026#39; FOR \u0026#39;admin\u0026#39;, Authentication failure Jun 26 10:22:25 EdgeRouter-X-5-Port login[11078]: FAILED LOGIN (2) on \u0026#39;/dev/pts/2\u0026#39; from \u0026#39;127.0.0.1:35062\u0026#39; FOR \u0026#39;admin\u0026#39;, Authentication failure Jun 26 10:22:29 EdgeRouter-X-5-Port login[11078]: FAILED LOGIN (3) on \u0026#39;/dev/pts/2\u0026#39; from \u0026#39;127.0.0.1:35062\u0026#39; FOR \u0026#39;admin\u0026#39;, Authentication failure Jun 26 10:22:33 EdgeRouter-X-5-Port login[11078]: FAILED LOGIN (4) on \u0026#39;/dev/pts/2\u0026#39; from \u0026#39;127.0.0.1:35062\u0026#39; FOR \u0026#39;admin\u0026#39;, Authentication failure Bruteforcing is probably not sexy these days, but is still a valid attack vector and it would be all the more important to fix if you advertise your EdgeRouter publicly.\nAdditional notes #I have contacted Ubiquiti and it has been some time since they have responded to me on whether they think this is a significant issue or not.\n","date":"1 August 
2024","permalink":"https://sheran.io/blog/edgemax-websocket-dos/","section":"Blog","summary":"Ubiquiti EdgeRouter Web GUIs may be disabled through an exposed WebSocket using this DoS attack","title":"EdgeMAX Websocket Denial of Service"},{"content":"","date":null,"permalink":"https://sheran.io/tags/vulnerabilities/","section":"Tags","summary":"","title":"Vulnerabilities"},{"content":"Here is how to cross compile gdb or gdbserver on x86_64 for MIPS.\nYou will need the following: # Ubuntu Linux bare metal server or VM running x86_64 (a fairly recent version; I used 22.04) Root access to the server Build steps: #1. Get all the sources #We are building from source. The versions are current as of writing this post. Here is where I got the sources:\ngdb 15.1 - https://sourceware.org/pub/gdb/releases/gdb-15.1.tar.xz GNU GMP lib v6.3.0 - https://gmplib.org/download/gmp/gmp-6.3.0.tar.xz GNU MPFR lib v4.2.1 - https://www.mpfr.org/mpfr-current/mpfr-4.2.1.tar.xz The two libraries are the GNU Multi Precision Arithmetic Library (gmplib) and the GNU Multi Precision Floating Point Library (libmpfr). These have to be built first before gdb. My assumption is that this is a throwaway server, so you can run all commands as root.\nFirst, let\u0026rsquo;s get the host set up before we download the sources:\napt update \u0026amp;\u0026amp; apt upgrade -y apt install -y build-essential m4 gcc-mipsel-linux-gnu g++-mipsel-linux-gnu I am building gdbserver for testing on a Ubiquiti EdgeRouter that is running mipsel, so I get the compiler for mipsel.\nNow let\u0026rsquo;s get the sources:\nwget https://sourceware.org/pub/gdb/releases/gdb-15.1.tar.xz wget https://gmplib.org/download/gmp/gmp-6.3.0.tar.xz wget https://www.mpfr.org/mpfr-current/mpfr-4.2.1.tar.xz 2. Building the libraries #We have to first build GMP because it is a requirement when building MPFR. 
tar xvf gmp-6.3.0.tar.xz \u0026amp;\u0026amp; cd gmp-6.3.0 ./configure --host=mipsel-linux-gnu make -j$((`nproc`+1)) make install cd ..\nThen we build MPFR: tar xvf mpfr-4.2.1.tar.xz \u0026amp;\u0026amp; cd mpfr-4.2.1 ./configure --host=mipsel-linux-gnu --with-gmp-build=/root/gmp-6.3.0 make -j$((`nproc`+1)) make install cd ..\n3. Building gdbserver #Now we can finally build gdbserver: tar xvf gdb-15.1.tar.xz \u0026amp;\u0026amp; cd gdb-15.1 ./configure --host=mipsel-linux-gnu --with-gmp-lib=/usr/local/lib --with-mpfr-lib=/usr/local/lib --with-gmp-include=/root/gmp-6.3.0 --with-mpfr-include=/root/mpfr-4.2.1/src make -j$((`nproc`+1)) LDFLAGS=-static\nYou will then find gdbserver in the gdbserver directory or gdb in the gdb directory. I have not tested gdb using the steps above, but I think it will have to be built using the --target flag as specified in this link.\nI build gdbserver statically so that it won\u0026rsquo;t depend on any libraries on the device which may or may not be there. The binary is slightly bigger, but the tradeoff is worth it in this case.\nYou can basically ctrl+c/ctrl+v these commands. I\u0026rsquo;ve tested them a couple of times and they work on a freshly installed Ubuntu 22.04 machine. See the errors section below in case you have problems.\nErrors #If you get some errors during make, it may be possible that the -j flag is causing some jobs to work faster than others. Try again after running make clean and removing the -j flag completely. It may build slower, but it will likely build.\nNext, I copy the file over to my router using scp and run it so I can remote debug processes on the router using IDA Pro or Binary Ninja:\nvbash-4.1# ./gdbserver 0.0.0.0:9898 --attach 17638 Attached; pid = 17638 Listening on port 9898 This allows debugging pid 17638 when you connect to it using your debugger. 
For example in Binary Ninja, from the menu bar, choose Debugger \u0026gt; Connect to Remote Process\nThen enter the host and port to connect and click Accept.\nConnect to remote process ","date":"30 July 2024","permalink":"https://sheran.io/blog/cross-compile-gdb-for-mips/","section":"Blog","summary":"These are my instructions on how to cross compile gdb and gdbserver for MIPS when you are on an x86_64","title":"Cross compile gdb for MIPS"},{"content":"","date":null,"permalink":"https://sheran.io/tags/debugging/","section":"Tags","summary":"","title":"Debugging"},{"content":"","date":null,"permalink":"https://sheran.io/tags/reverse-engineering/","section":"Tags","summary":"","title":"Reverse-Engineering"},{"content":"In the early 2000s ARP spoofing was our go to party trick to use on internal pentests at Scan Eye Tea. It’s what I used in the bank we got chased away from. A week after my stint as a garbage collector, I was to assume the role of an employee of the bank at their head office for a week. This time they knew I was coming. My mission, should I choose to get paid that month, was to go in with no prior information, find a bunch of administrator level passwords and then use them to gain access to what the bank would classify as: very important systems.\nBack then, we would recommend our clients give us a staging environment to run our tests. However, like Boeing, most of them glossed over this detail and allowed us to run tests in production. So we would ARP spoof in actual production networks. Before we proceed, let me take a brief aside to describe what ARP spoofing is. All computers talk to each other on a network of some sort. A group of computers within the same building will typically be on the same network. Each of these computers will relay information to each other through a gateway. In most cases this gateway is some sort of network switch or router. 
ARP spoofing is the extremely risky technique of running a smear campaign against the current gateway. Like mudslinging in the US elections, one candidate on the network will bombard all other citizens on the network with propaganda, informing them that the current gateway doesn’t know what it’s doing, and that all network traffic should be entrusted to that candidate instead. As that candidate, you then begin to take on the very important role of sending and receiving traffic on behalf of all the citizens on the network. The intended effect of this is that you now get to see all the network traffic flowing back and forth.\nAfter first ensuring that my flagstone of a laptop was set to forward traffic, often a gotcha in ARP spoofing, I kicked off the negative PR campaign against our friend the HP Switch using Ettercap.\n“Ho ho ho! Just wait until the network hears about what YOU DID!”\nDutifully, all computers on the network then slowly began sending me their data and I began writing it to a file locally and then forwarding it along to its intended destination. Later I would fire up Ethereal and sift through the data to find any useful bits of information. Information such as admins logging into servers using telnet or FTP. Now, if you have even the faintest familiarity of the tools I just mentioned, then I hope your midlife crisis is going well. I recommend getting the lava orange 992 GT3.\nI discovered several admin passwords and some other interesting network ranges to explore. I like to multitask while waiting for longer scans to complete and one thing I normally do is to scan large groups of hosts for one or two ports. So while waiting, I scanned for HTTP ports and found a handful of hosts which I then opened up in my browser. A couple of them were just the web interfaces for the photocopiers and then one bare looking login page for something that seemed like a custom developed portal. 
I can’t remember what the brand of the copier was, but it had this setting enabled on it that would save a copy of every scanned document to its local storage. I would end up having some fun sifting through some confidential documents later on, but for now I turned my attention to this portal.\nNot the actual ATM. Image Copyright Szymon Kochański One of the passwords I had collected let me in and I saw a two column layout with the larger of the columns on the right containing a map of the UAE. Dotted around the map were little red and green circles and clicking on each of them opened a small info box that had an IP address, hostname, running state and some other stats. What mostly caught my eye was that all the hostnames were prefixed “ATM_”. With shaky hands, I made notes. Always make notes. Always. I collated all the IPs of what I believed was their ATM network and prioritized scanning these. I’ll spare you the details of what came next, and instead summarize as follows:\nEach ATM ran outdated Windows XP with remote desktop enabled. Each ATM had the same administrator password. ATMs would collect and store images of each customer interaction locally. I could shutdown the entire ATM network, but would have to do so one ATM at a time. I didn’t mess with the ATM binary because I felt the access I had gained was sufficient and I also didn’t want to break anything. I didn’t want the intervention of the already angry bank IT team. The Scan Eye Tea team always riled up the client IT teams. I learned that effective pentesting was as much about relationship building and education as it was about skills. Admittedly, I learned this 15 years later only after arrogance eroded with age.\nI earned my keep that week and my reports ended up being my eventual selling point where repeat clients would ask for my involvement on projects. I was in great company with my colleagues and my cybersecurity career was off to a great start. I couldn’t ask for more. 
ARP Spoofing was the one weird trick that worked that day. It is a double-edged sword however, and not all ARP Spoofing stories have a great ending to them. Maybe I’ll tell you one of those stories another time.\n","date":"15 May 2024","permalink":"https://sheran.io/blog/pentesters-hate-him/","section":"Blog","summary":"ARP spoofing used to be such a powerful tool that no one talks about lately; this is a story of how I owned the ATM network of a bank in the UAE.","title":"Pentesters Hate Him. One weird trick to PWN everything!"},{"content":"","date":null,"permalink":"https://sheran.io/tags/stories/","section":"Tags","summary":"","title":"Stories"},{"content":"","date":null,"permalink":"https://sheran.io/tags/ransomware/","section":"Tags","summary":"","title":"Ransomware"},{"content":"Shook Lin \u0026amp; Bok, a law firm based in Singapore, has paid a ransom of 1.4 million US Dollars in Bitcoin to a ransomware gang. The hostage? Their data, apparently. Shook Lin \u0026amp; Bok is likely not the first of its kind to pay a ransom and it will very likely not be the last. This is sad when you realize that 5% of the ransom is a little over S$100,000 Singapore dollars. For a law firm of that size, I think a 100k would be enough to get some proper endpoint security, phishing simulations, staff training, a bug bounty, and even a pentest or two.\nI hate pitching why security is important to would-be clients. I no longer do it and instead focus on digital forensics and incident response (things that happen AFTER you get hacked.) I hate selling security because I don\u0026rsquo;t think I am good at pointing out the why of it. I feel like my pitch becomes woefully close to the much loathed act of selling insurance: \u0026ldquo;If you don\u0026rsquo;t pentest your infrastructure today, you won\u0026rsquo;t get hacked later in the year.\u0026rdquo; I am also just really bad at sales. 
I imagine every time I leave one of these pitch sessions the participants gather around and high five each other because they \u0026ldquo;showed me\u0026rdquo; by not paying my fee and instead have saved up their profits to pay out to ransomware gangs just for the news-cycle flex.\nIn my experience companies viewed me with suspicion, most likely founded on an insufficient understanding of technology and security. Does this mean they trust the anonymous ransomware gangs more than an ineffective insurance salesman with no teaching skills? Probably. But then I thought about this a bit more. Is it time that we, as cybersecurity practitioners, do more pro-bono work in educating the masses?\nTo me, this looks like carving out a number of hours in a month to dedicate towards either conducting workshops or having one-on-one calls with SMEs to help demystify why security is important. It isn\u0026rsquo;t enough that we release tiktoks or bite sized LinkedIn posts no matter how much the world supposedly has ADHD. The educational aspect has to be real and this means wilfully setting aside time to teach and answer questions. You absolutely cannot and should not have any expectation of loyalty or lock-in from the client after the session. Your pro-bono work should be dictated by the simple fact that once the time is given, there are zero expectations of \u0026ldquo;value\u0026rdquo; that should come out of it, other than the value of having more savvy SMEs who understand why security is important.\nSo let me start: If you\u0026rsquo;re an SME in Singapore that has questions about ransomware, or even how to structure your in-house security, drop me an email and schedule some time to meet. It won\u0026rsquo;t cost you anything other than your time and there are no stupid questions that you can ask. 
Speak soon!\n","date":"9 May 2024","permalink":"https://sheran.io/blog/shooklin-can-happen-to-anyone/","section":"Blog","summary":"While what happened to Shook Lin \u0026amp; Bok can happen to anyone, we should explore further what it means to be security practitioners.","title":"Shook Lin \u0026 Bok Can Happen to Anyone"},{"content":"My first role in security was as a Senior Security Engineer. I was living in Dubai and had just been hired by a company called scanit. An acronym for Systems Computers and Network Intrusion Team and an amalgam of something you would do as a security engineer, clever. All our clients in the Gulf pronounced it: Scan Eye Tea.\nI gave up an annual bonus of 5 times my monthly salary and took a hefty pay cut to join them. I worked at the UAE telco called Etisalat (the only one at the time.) I remember telling my wife. She had a look of confusion and incredulity on her face as if I just told her I was off to join a circus. In hindsight there was a bit of juggling, lion taming, and escape artistry, but I never did answer her questions on how I would make things work considering we had a one year old at the time. I think she’s still waiting for a reply 18 years later.\nI landed an interview when I sent in my résumé after discovering the scanit website. A day later, the CEO replied, and I got called in. I remember giving him some spiel that I prepared. Something gross like: “my greatest strength is that I never stop working” and saw a flash of disgust wash over his face. Thinking I blew it and had nothing to lose, I then told him that I was an amazing pentester, would do anything to work in security full-time and was dying a little each day I worked at Etisalat. He then asked me to chat with some of the other engineers in the team. 
After a slightly awkward conversation because neither interviewers nor interviewee were very adept at doing interviews, we began nerding out over technology where at some point after I had to explain the low level intricacies of how the TCP 3 way handshake and traceroute worked, I was hired.\nscanit didn’t waste much time and had me in the field on my first week out. I received a heavy piece of metamorphic rock on my first day. Confused, I examined it further to realize it was a slate colored Dell laptop. My new peers peered over the half height cubicles to survey my new arsenal. They said that a visiting engineer from Belgium had the same laptop and that it was tough as nails. He had, according to them, taken his backpack containing his laptop on a desert safari and when the Land Cruiser stopped for a break, he got out to stretch his legs a bit. All the dune bashing beforehand had shifted the contents of his backpack which was at his feet and so his laptop had slid out of the half unzipped bag and hit the sand. In a panic to retrieve it he’d leapt out of the SUV and landed squarely on the half buried laptop with both feet. Apart from some scratches apparently it worked just fine afterwards. Well that’s handy then. If I’m unable to hack into a server, I’d at least be able to throw my laptop at it.\nNot the actual laptop. We had won the contract of a rather large local bank in Dubai and it was for the full works. Internal and External Pentest, Web application assessment, ATM hacking, pretty much anything we can do to steal money from the bank. My first outing with the team was on a dumpster diving mission. We were supposed to case one of the bank branches and then steal their trash. This was to catch any instances where sensitive documents were thrown directly in the trash instead of being disposed of securely. There were two big, green rubbish skips in the back of the rectangular building next to a single loading bay. 
The loading bay was at one of the corners of the longer sides of the building. Three of us sauntered over and began digging through garbage. We hadn’t perfected our technique yet, but it seemed to suffice to use our belly as a pivot point where upper bodies were inside the skip and our legs dangled outside as counterbalance. I was thankful that all the trash was dry and paper based rather than having to pick through the remains of a chicken biryani someone had thrown out the day prior. We didn’t scan too much, but instinctively picked up whatever papers that looked important and stuffed them in our backpacks. We found a mound of shredded paper that we grabbed as much of as possible. We moved steadily for about 5 minutes until I heard someone yelling in Arabic. I looked up and saw two bemused security guards staring at us. I knew that while they looked stocky enough to be menacing, they seemed like they would also need a breather between tying their own shoelaces. Obviously we hadn’t planned this as well as Danny Ocean had and so we couldn’t fathom that the guards would do a walk around mid-afternoon in 40 degree heat. Typically when on social engineering jobs we would revert to an air of confidence that communicated that we were meant to be there and knew what we were doing. In this scenario, we ran. We rounded the corner and ran to the car hearing more huffing and yelling in Arabic behind us. Thankfully our CEO would give us liberal access to his gray Honda Accord AND his driver who was sitting in the car with the engine and A/C running (later that day he would also help us reassemble shredded paper like a jigsaw puzzle.) We piled into the car and simultaneously yelled at him to drive! drive! drive! Which he did. We took off looking back in time to see the guards round the corner. One of them doubled over and supported himself against the building with one arm and looked as though he was about to throw up.\nNot the actual dumpster. 
We returned to the office with reams of multi-layer carbon paper printouts, many pages both intact and shredded and bank stationery. After about 3 hours worth of sifting through and re-assembling the straight shredded paper, we found absolutely nothing that would help us in our digital onslaught on the bank. We did, however, find out much about its corporate clients, the credit facilities extended to them and in-depth information about them enough to warrant a social engineering attack that we did not end up conducting.\nI had an absolute blast! The work was such a departure from the soul-sucking drudgery I had at Etisalat that I would have gladly done it for free. Looking at my pittance of a paycheck at the time, one could argue that I really was doing it for free. Nevertheless I was energized and looked forward to going into the office. This was all I ever wanted to do. I would daydream of working in a company like this and never imagined I would get to do it in real life. Yet, here I was, working for Scan Eye Tea. The washed out desert sky seemed just a tinge bluer and the piles of desert sand around us seemed sandier.\nLater that evening at home, my wife asked why the midsection of my clothes was so grimy. I explained that I was out with the team prospecting for serious, sensitive information that was insecurely discarded from a bank.\n“So you were out digging through garbage?”, she asked. Ever the reductionist.\nI had no other way to articulate that I didn’t leave a high paying job in vain, just to take a pay cut and start the week out digging through someone else’s garbage. 
So I pulled the Dell laptop out of my bag and said, “Look, they gave me this new laptop, it’s supposedly very rugged.”\n","date":"25 April 2024","permalink":"https://sheran.io/blog/first-week-in-cybersecurity/","section":"Blog","summary":"This is the story of how I got chased by security when dumpster diving on my first week on the job.","title":"My First Week as a Security Engineer"},{"content":" Update for 8th March 2024: More or less an hour after I posted on x.com, the vulnerability no longer exists. It looks like the fireflies.ai team invalidated their Growthbook.io key.\nI reached out to the Fireflies.ai security team on the 20th Dec 2023 about an issue I found in their browser extension for their service. I received a standard ZenDesk ticket response, but nothing else after. About a week after that, I contacted them through their report a vulnerability form but did not hear back from them yet again. I looked at HackerOne\u0026rsquo;s disclose a vulnerability option to report it, but they make things way too restrictive and nebulous for me to even think of working with them. So here I am writing about this piddly little information leakage bug I found.\nIf you use Fireflies.ai, you know what it is and this post is probably more relevant to you. If not, then I\u0026rsquo;m not giving them free advertising. Fireflies.ai uses Growthbook.io. Growthbook can best be described as the #1 open source feature flagging and experimentation platform. Their words, not mine. I don\u0026rsquo;t actually know what it does but from the little I gathered, you can use it to toggle features on and off in your app without re-publishing it and then see how those changes affect user behavior. Yay growth.\nThis all came about when I was screwing around with my browser DOM because I was trying to read yet another useless article behind a paywall and I saw this one request to cdn.growthbook.io in the developer console\u0026rsquo;s network tab. 
The response was JSON and contained enough email addresses and domains for me to abandon my pursuit of finding out why AI will be taking over all our jobs next Thursday. A few minutes of grepping (which is, I guess, a glorified way of saying I hit Ctrl+F and searched for a bunch of shit) and I found the source of the request to be the Fireflies.ai Chrome extension I had installed on my browser. It was calling Growthbook\u0026rsquo;s SDK end-point to fetch a list of feature flags. My assumption is that the Fireflies.ai dev team uses Growthbook feature flags for all of their web and mobile apps including the extension. Growthbook has some comparison operators that can be used to determine if and when to turn a feature on or off and the team seem to have made extensive use of that feature. Now I don\u0026rsquo;t know if the Fireflies.ai dev team were told it was a good idea to put emails and domains in Growthbook\u0026rsquo;s feature flags, which was clearly public and available for all to see, or if it was just a shortcut they took. Basically that\u0026rsquo;s the vulnerability.\nAnyone that knows Fireflies.ai\u0026rsquo;s client key and the url to fetch feature flags from Growthbook can download well over 400 email addresses and about 240 domain names that are all using Fireflies.ai. If you look through the source code to the chrome extension, then you should be able to find both client key and url. As a matter of fact, you will find the exact url which contains the client key needed to fetch the list of feature flags. Call that url with curl and you will see the emails and domains of the Fireflies.ai users.\nIn the screenshot above, you can see the conditional checks being done with the feature flags.\nThat\u0026rsquo;s it. 
That\u0026rsquo;s the finding.\n","date":"23 February 2024","permalink":"https://sheran.io/blog/fireflies/","section":"Blog","summary":"Shortcut leaves hundreds of fireflies.ai user emails open to public","title":"Fireflies.ai leaks emails through Growthbook.io"},{"content":"","date":null,"permalink":"https://sheran.io/tags/information-leak/","section":"Tags","summary":"","title":"Information Leak"},{"content":"","date":null,"permalink":"https://sheran.io/tags/osint/","section":"Tags","summary":"","title":"Osint"},{"content":" Maybe it\u0026rsquo;s because we dragged all our wordlists across from the days of Van Hauser\u0026rsquo;s Hydra way back in 2000. But something happened around the time when the OSCP certification began picking up steam. A wave of new tools, mostly written in either Go or Rust, flooded the interwebs. Along with these tools came a fleet of wordlists. Millions of words in a text file that were to be used for the sole purpose of brute-forcing. I think the most popular set of wordlists can be found here. Looking at the sheer number of wordlists for any occasion, one would think the proverb \u0026ldquo;a rolling stone gathers no moss\u0026rdquo; hasn\u0026rsquo;t really applied here. We\u0026rsquo;ve come so far and just look at all that moss we\u0026rsquo;ve collected! I would hope that we all use SecLists as a starting point and then slowly distil the wordlist down as we get to know our region or country better.\nNevertheless, I thought I would first look at the DNS wordlists in the repo. Find them under /Discovery/DNS. This directory has wordlists that you can use to brute-force subdomains. I wrote a go module to do two things:\nLex the wordlists and check if there are invalid characters for resolving DNS hosts and Do a proper line count of the wordlist. 
I ran the library on the files inside the DNS directory and here are the results: ➜ sheran@leonov linecount go test -run=TestLexer 2023/11/05 21:51:00 filename: bitquark-subdomains-top100000.txt error: invalid character \u0026#39;*\u0026#39; found at row 37212 col 1 2023/11/05 21:51:00 filename: bug-bounty-program-subdomains-trickest-inventory.txt linecount: 1613291 2023/11/05 21:51:00 filename: combined_subdomains.txt error: invalid character \u0026#39;*\u0026#39; found at row 1 col 1 2023/11/05 21:51:00 filename: deepmagic.com-prefixes-top500.txt linecount: 500 2023/11/05 21:51:00 filename: deepmagic.com-prefixes-top50000.txt error: invalid character \u0026#39;_\u0026#39; found at row 4715 col 7 2023/11/05 21:51:00 filename: dns-Jhaddix.txt error: invalid character \u0026#39;@\u0026#39; found at row 4 col 1 2023/11/05 21:51:00 filename: fierce-hostlist.txt error: invalid character \u0026#39;_\u0026#39; found at row 770 col 4 2023/11/05 21:51:00 filename: italian-subdomains.txt linecount: 20000 2023/11/05 21:51:00 filename: n0kovo_subdomains.txt error: invalid character \u0026#39;\\n\u0026#39; found at row 240002 col 1 2023/11/05 21:51:00 filename: namelist.txt error: invalid character \u0026#39;_\u0026#39; found at row 4979 col 8 2023/11/05 21:51:00 filename: remain.txt linecount: 1497687 2023/11/05 21:51:00 filename: shubs-stackoverflow.txt error: invalid character \u0026#39;,\u0026#39; found at row 807 col 24 2023/11/05 21:51:00 filename: shubs-subdomains.txt error: invalid character \u0026#39;_\u0026#39; found at row 32597 col 8 2023/11/05 21:51:00 filename: sortedcombined-knock-dnsrecon-fierce-reconng.txt error: invalid character \u0026#39;_\u0026#39; found at row 2242 col 1 2023/11/05 21:51:00 filename: subdomains-spanish.txt error: invalid character \u0026#39; \u0026#39; found at row 411 col 8 2023/11/05 21:51:00 filename: subdomains-top1million-110000.txt error: invalid character \u0026#39;_\u0026#39; found at row 689 col 4 2023/11/05 21:51:00 filename: 
subdomains-top1million-20000.txt error: invalid character \u0026#39;_\u0026#39; found at row 689 col 4 2023/11/05 21:51:00 filename: subdomains-top1million-5000.txt error: invalid character \u0026#39;_\u0026#39; found at row 689 col 4 2023/11/05 21:51:00 filename: tlds.txt error: invalid character \u0026#39;[\u0026#39; found at row 1411 col 11 2023/11/05 21:51:00 files processed: 20 errors: 15 PASS ok github.com/sheran/linecount 0.280s ➜ sheran@leonov linecount\n15 of the 20 files in that directory had invalid characters. By that I mean, if you ran that through gobuster\u0026rsquo;s DNS brute-forcer, those subdomains won\u0026rsquo;t resolve. That\u0026rsquo;s because the RFC for DNS has a preferred name syntax (section 3.5) where the ruleset is as follows:\n\u0026lt;subdomain\u0026gt; ::= \u0026lt;label\u0026gt; | \u0026lt;subdomain\u0026gt; \u0026#34;.\u0026#34; \u0026lt;label\u0026gt; \u0026lt;label\u0026gt; ::= \u0026lt;letter\u0026gt; [ [ \u0026lt;ldh-str\u0026gt; ] \u0026lt;let-dig\u0026gt; ] \u0026lt;ldh-str\u0026gt; ::= \u0026lt;let-dig-hyp\u0026gt; | \u0026lt;let-dig-hyp\u0026gt; \u0026lt;ldh-str\u0026gt; \u0026lt;let-dig-hyp\u0026gt; ::= \u0026lt;let-dig\u0026gt; | \u0026#34;-\u0026#34; \u0026lt;let-dig\u0026gt; ::= \u0026lt;letter\u0026gt; | \u0026lt;digit\u0026gt; \u0026lt;letter\u0026gt; ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case \u0026lt;digit\u0026gt; ::= any one of the ten digits 0 through 9 So that\u0026rsquo;s it. a-z, A-Z, 0-9, and -. Those are the only characters that are allowed when looking up a host. This means that all the words with invalid characters will not resolve correctly and worse, will make your brute-force task slower. So why are these weird characters even there?\nA few reasons. First, I know that there were programs that would do something with a wildcard like \u0026quot;*\u0026quot;. 
The program itself would take a hostname like \u0026quot;starfish*\u0026quot; and expand it to \u0026quot;starfish1\u0026quot;, \u0026quot;starfish2\u0026quot;, \u0026hellip; \u0026quot;starfish9\u0026quot;. But modern tools like gobuster don\u0026rsquo;t do this. So essentially you\u0026rsquo;re ruining the efficiency of your already long ass brute-forcing session.\nWhat should you do about it? Well, in my opinion, it makes sense to clean the files so that they only contain subdomains that have a chance of successfully resolving. That takes care of one part of the mess that is DNS brute-forcing. The next part involves developing a repeatable process for taking all the successful hits from these files and building a smaller, more concentrated file of working subdomains. This can greatly speed up your discovery process.\n","date":"5 November 2023","permalink":"https://sheran.io/blog/wordlists/","section":"Blog","summary":"Cleaning up our cybersecurity brute force wordlists one character at a time.","title":"Our Wordlists Kinda Suck"},{"content":"I\u0026rsquo;ve been getting back into DFIR and I was testing out this tool called Cyber Triage. I discovered it when I saw a friend of mine had a workshop that he was doing and I duly registered for it. It\u0026rsquo;s a neat tool that helps an investigator through his examination process. It collects the usual data on a system like metadata, user activity and the places where malware tends to persist. It also collects volatile data. In addition to collection, the tool also helps with prioritisation and recommendation. This means that it directs you to look at what it deems are higher priority items and then provides guidance on how to tackle them. I eagerly downloaded the tool, and suddenly remembered when I tried to install it that DFIR is still predominantly a Windows game. 
The tool needed to be installed on a Windows machine and of course, the collection tool as well would only work on a Windows machine. Not wanting to heap judgment on a tool based on what it cannot do, I optimistically dusted off my Microsoft Surface to install the tool on and collect data.\nThis is not a review of Cyber Triage. Cyber Triage has a lot of cool and interesting features that you should probably check out by yourself with an evaluation copy, but for me, I\u0026rsquo;m currently focusing on one objective. But let\u0026rsquo;s not get ahead of ourselves and let\u0026rsquo;s first collect some data from my Windows machine.\nOpening Screen When you say you want to Triage the Local Host, you get this popup which to me makes sense.\nRecommended Use Case You probably should not be collecting artifacts from a compromised system and should instead use the collection tool that they offer. So I did just that. Selecting other host, you will see the option to use the Collection Tool at the bottom bar of the window:\nCollection Tool Selecting this option extracts a set of files (which together make up the collection tool) that you can then copy onto an external drive and use to collect artifacts from a system that you\u0026rsquo;re investigating.\nCollection Tool - Files When you run the tool, you first configure the level at which you want to collect artifacts and then run it. I ran it for both a VM and my Microsoft Surface and it generated two gzipped files in the output directory:\nCollected Data In the image above, the Surface file is the 2.7 GB gzipped file. It\u0026rsquo;s quite a hefty file and that is because it also includes all relevant files on the system. One thing you may also notice is that the file is a json file. 
I was elated when I saw this because it meant that I could immediately stop working on the Windows machine and move the analysis to a Linux or MacOS box where I feel more at home.\nThe first thing I did was to decompress the file to understand just what kind of beast I was dealing with. Uncompressed, the file came out to about 4.1Gb\n-rw-r--r-- 1 sherangunasekera staff 4.1G Sep 20 22:45 cttout_DESKTOP-BCC5PAU_20230920_12_51_14.json\nSo already it\u0026rsquo;s going to be a challenge to open, let alone inspect a 4.1Gb json file. I tried opening it on my Mac Studio using Sublime Text, but that took too long. It\u0026rsquo;s quicker to open it on the terminal using less, but again, paging through screen after screen can get tedious. Then I thought, \u0026ldquo;Why not use jq?\u0026rdquo;. For the uninitiated, jq is a command line json file processor that can work on and extract data from a json file. The downside of jq is that you have to almost learn an entire language before you get going. But in reality, you can get up to speed fairly quickly. So in order to process my behemoth of a json file, all I had to do was to stream the gzip file using gzcat and then pipe it through to jq. Simple right? Let\u0026rsquo;s try that:\n➜ sherangunasekera@Sherans-Mac-Studio ctlite git:(master) ✗ gzcat cttout_DESKTOP-BCC5PAU_20230920_12_51_14.json.gz | jq empty jq: parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 84413, column 94 Well shit. I should have expected the quirks that come with a Windows Mac/Linux operation. Just where and what is this invalid string with un-escaped control characters? I open up the uncompressed json file with less and head on down to line 84413 to find:\nOffending Control Chars Ah yes, there seem to be some weird characters when the tool collected the PE Headers of executable files. 
I didn\u0026rsquo;t actually bother trying to figure out what they are, because my current goal is to get the file to work with jq so that I can dig deeper into its format. So I started looking for ways in which I could fix these invalid control characters while streaming the file.\nEither I was impatient and didn\u0026rsquo;t spend enough time on the problem, or the general consensus seems to be that a json file should NOT contain characters not allowed by the spec. If by chance you do happen to have a file that doesn\u0026rsquo;t conform to the spec, then you\u0026rsquo;re shit outta luck. I choose to believe the latter. After browsing many Stack Overflow and blog posts and cajoling ChatGPT into giving me a one liner this one seemed to work:\nFixing Invalid Control Chars with a Perl one-liner But a minute and twelve seconds to run? Jeez! That seems unusually slow. To breakdown the command:\ngzcat cttout_DESKTOP-BCC5PAU_20230920_12_51_14.json.gz | \\ perl -C -pe \u0026#39;s/([^\\x20-\\x7E])/sprintf(\u0026#34; \u0026#34;)/ge\u0026#39; | jq empty gzcat \u0026lt;filename\u0026gt; will \u0026ldquo;cat\u0026rdquo; or print the output to standard out after decompressing it. So think of it like you stream and decompress bytes to stdout without having to first decompress the file and then cat the contents. This helps to not use up your disk space. perl -C -pe 's/([^\\x20-\\x7E])/sprintf(\u0026quot; \u0026quot;)/ge' Will look at the stream and replace any character not in printable range (0x20 - 0x7E) with a space. Basically this will replace all invalid control chars with a space. jq empty will run the stream through jq but will not print anything out to stdout. This is a good way to validate any json files that you have. If it returns without an error, then your json file is valid, if not you\u0026rsquo;ve got an error. I can technically run this and do my examination on the format of the file, but it will take me a minute and change each time I run it. 
I didn\u0026rsquo;t feel like that was productive so I thought I\u0026rsquo;d try to improve on the speed of processing the gzipped json file. In theory, I could have just stopped here and written that json file out to disk and worked on that, but I constrained myself to not working on the decompressed gzip file which was twice the size of the gzipped one. Also note that I am destructively fixing this file, meaning I am replacing each invalid control character with a space. In DFIR this is an obvious no-no because you really want to work with the original collected data as much as possible. There are probably better solutions on how to do this, but that is a post for another day. When I say \u0026ldquo;destructive\u0026rdquo; I am still not making any permanent changes to the collected data because I am only operating on the stream in memory.\nI set out to write a go program to do what I needed to do. The journey was very long and took almost a month of me working on it over the weekends because I was obsessed. I am not going to post all the code I wrote, but will give you a breakdown of what and how things transpired:\nNaive Go # I wrote a go program to read the gzipped contents, clean out invalid control characters and output it to a json file. I had to write a transformer using \u0026ldquo;golang.org/x/text/transform\u0026rdquo; to filter out the characters I didn\u0026rsquo;t want. It took a minute and 35 seconds to execute and extract the json file. I could then run jq on that file and work with that. That took about 30 seconds to execute per try which was not too bad, but it goes against my constraint of not having the decompressed json file lying around.\nNaive Go Using pgzip # It turns out that the encoding/gzip in the standard library is not as quick as it can be. So after hunting around I discovered pgzip. This is purportedly a faster version of gzip which you can use as a drop-in replacement. 
This was great because I didn\u0026rsquo;t have to change any of my code. This got me an entire 24 seconds shaved off which was quite impressive! But I felt I could do better.\nUsing pgzip By then, I also discovered gojq which is a pure go implementation of jq. This had me thinking that I could run all my analysis or extraction directly bundled into one go program. While writing all this code and looking deep into json processing, I discovered that using a transformer the way I did was extremely slow and wasteful. Having achieved some success with pgzip, I looked around to see if there was a faster json processor. It seemed like there were a few, and for my use case, I ended up selecting json-iter.\nThere was just one problem though. All json processors I looked at adhered very strongly to the spec that you cannot have certain control characters when reading a file. They would all throw errors with no way to adapt them into my program. So I ended up having to fork and modify json-iter to adapt it to my use case. 
Specifically, I changed some code in the iter_str.go file like this (line 14-19):\n// ReadString read string from iterator func (iter *Iterator) ReadString() (ret string) { c := iter.nextToken() if c == \u0026#39;\u0026#34;\u0026#39; { for i := iter.head; i \u0026lt; iter.tail; i++ { c := iter.buf[i] if c == \u0026#39;\u0026#34;\u0026#39; { ret = string(iter.buf[iter.head:i]) iter.head = i + 1 return ret } else if c == \u0026#39;\\\\\u0026#39; { break } else if c \u0026lt; \u0026#39; \u0026#39; { // iter.ReportError(\u0026#34;ReadString\u0026#34;, // fmt.Sprintf(`invalid control \\ // character found: %d`, c)) // return c = 0x20 iter.buf[i] = 0x20 } } return iter.readStringSlowPath() } else if c == \u0026#39;n\u0026#39; { iter.skipThreeBytes(\u0026#39;u\u0026#39;, \u0026#39;l\u0026#39;, \u0026#39;l\u0026#39;) return \u0026#34;\u0026#34; } iter.ReportError(\u0026#34;ReadString\u0026#34;, `expects \u0026#34; or n, \\ but found `+string([]byte{c})) return } In the character check, if the character byte value is smaller than 0x20 then the original behaviour is to exit with an error. I changed that to not throw an error and instead to replace that character and the character in the buffer with 0x20 (which is a space). After making those changes, I went ahead and ran my program which used gojq, pgzip and json-iter. Basically the program behaves like gzcat and jq all rolled into one which also filters out invalid control characters:\nUsing pgzip and json-iter 24 seconds!!! I had dropped processing by 47 seconds! This was pretty much time to celebrate. I can basically probe this output file as much as I want by running various jq queries on it and only wait 24 seconds for it to return with results. This vastly speeds up my analysis of the file. Getting very excited about this, I decided to cross compile a version and run it on a Windows VM to see what kind of performance I can get on that:\nOut of Memory on Windows VM Well that didn\u0026rsquo;t work. I ran out of memory. 
Back to the drawing board I guess. It would seem, after more inspection, that jq reads the entire decompressed data stream into memory before it executes.\n7.4Gb Peak Memory Footprint Look at that! 7446395840 bytes or 7.44Gb of maximum memory usage. This is why my lowly 4Gb RAM VM crapped out. So it appears that jq and also gojq do not operate on a stream by default. But they do have a way to: the \u0026ldquo;--stream\u0026rdquo; option, which makes them act on the stream. The drawback to this is that you have to adapt your querying approach with jq. It isn\u0026rsquo;t a huge deal, but enough to be a pain in the ass.\nI took some time in adapting my code to use gojq\u0026rsquo;s streaming because there are no examples of how to do this. You have to look through the source and figure things out. That was pretty much what I spent time doing and finally had my program running the way I wanted it to:\nMemory usage with --stream Takes a bit longer at 38 seconds but looking at the peak memory usage, it seems way better. Only 1.66Gb used. Let\u0026rsquo;s try running that on our Windows VM now:\nExec time on low-resource Windows VM Not too bad for a very limited VM, I guess. At least I know it can also be used on Windows. I still have not found a way to log the maximum memory usage in Windows, but I did see it get to around 1.5 to 1.6Gb from the Resource Monitor. Here\u0026rsquo;s how it grew:\nTracking Memory usage over time And with that we come to how to actually use the go program. 
Well, if using it with the jq stream queries, then to get all progress messages from Cyber Triage, you can run:\n➜ sherangunasekera@Sherans-Mac-Studio cybertriage git:(master) ✗ ./cybertriage ~/go/src/github.com/sheran/ctlite/cybertriage_evaldata_20230822.json.gz \\ \u0026#34;select(.[0][4] == \\\u0026#34;progress\\\u0026#34; and .[0][5] == \\\u0026#34;message\\\u0026#34;)|select(.[1]|length \u0026gt; 0)|.[1]\u0026#34; Creating temporary working folder Opening target image/disk Enumerating Files... (Step 1 of 13) Collecting Network Caches... (Step 2 of 13) Collecting DNS Cache. DNS Cache collection started. Collecting ARP cache. Analyzing routing tables. Searching for system registry files Searching for system registry files Searching for system registry files Collecting Users... (Step 3 of 13) Searching for user registry files Searching for user registry files Searching for user registry files Analyzing Startup Items... (Step 4 of 13) Analyzing Programs Run... (Step 5 of 13) Collecting Network Shares... (Step 6 of 13) Collecting System Configuration... (Step 7 of 13) Analyzing Scheduled Tasks ... (Step 8 of 13) Searching for scheduled task files Searching for scheduled task files Searching for scheduled task files Searching for event logs Searching for event logs Searching for event logs Analyzing Event Logs... (Step 9 of 13) Analyzing event logs... Parsing security log started. Parsing security log completed. Parsing microsoft-windows-terminalservices-localsessionmanager/operational log started. Parsing microsoft-windows-terminalservices-localsessionmanager/operational log started. Parsing Microsoft-Windows-TerminalServices-RDPClient/Operational log started. Parsing Microsoft-Windows-TerminalServices-RDPClient/Operational log started. Analyzing Application \u0026amp; Service event logs... Collecting Processes... (Step 10 of 13) Collecting Network Connections and Ports... 
(Step 11 of 13) Collecting WMI database files Collecting Powershell profile startup items Processing startup folders Searching for startup files Searching for startup files Searching for startup files Analyzing startup folder: /ProgramData/Microsoft/Windows/Start Menu/Programs/Startup Processing Prefetch files Searching for Prefetch files Searching for Prefetch files Searching for Prefetch files Searching for StartupInfo files Searching for StartupInfo files Searching for StartupInfo files Collecting Web Files ... (Step 12 of 13) Collecting Firefox Databases Collecting Chrome Databases Collecting Downloads Collecting Internet Explorer files Collecting Microsoft Edge files End of scenario data Additional data Analyzing All Files... (Step 13 of 13) That is an example I ran on the Cyber Triage evaluation data that they provided.\nWith this approach, it becomes quite easy to understand the Cyber Triage collection format and thereby continue your investigation on Linux or MacOS on the command line.\nThis post is kinda long, so I\u0026rsquo;m going to wrap it up here. I ended up talking more about how to effectively work with a roughly 4Gb, gzipped json file within memory and time constraints than about Forensics. Of course this tool is also applicable to non forensics use cases, though I wonder which other tech discipline will require cleaning out gargantuan json files.\nEven as I wrap up this chapter, I have already written a similar cleaner in Zig and have seen some very promising results. 
Stay tuned I guess and maybe I\u0026rsquo;ll talk about that in the future.\n","date":"22 October 2023","permalink":"https://sheran.io/blog/cybertriage/","section":"Blog","summary":"How to speed up DFIR workflow and read a Cyber Triage collection file through the CLI","title":"Cyber Triage on MacOS?"},{"content":"","date":null,"permalink":"https://sheran.io/tags/dfir/","section":"Tags","summary":"","title":"Dfir"},{"content":"","date":null,"permalink":"https://sheran.io/tags/forensics/","section":"Tags","summary":"","title":"Forensics"},{"content":"","date":null,"permalink":"https://sheran.io/tags/json/","section":"Tags","summary":"","title":"Json"},{"content":"","date":null,"permalink":"https://sheran.io/tags/parsing/","section":"Tags","summary":"","title":"Parsing"},{"content":"I do security audits and penetration testing for web and mobile applications.\nPreviously, I was the founding CTO of Gojek (now GoTo), where I built their first cybersecurity team and scaled infrastructure from thousands to millions of users. Before that, I launched Zalora\u0026rsquo;s Indonesia e-commerce platform as VP of Tech. I\u0026rsquo;ve written two books on mobile security and have 20+ years of experience across startups, banks, telcos, and oil \u0026amp; gas.\nSome highlights: reverse engineering UAE state-sanctioned Blackberry malware, demonstrating a 400% payout exploit on a bank\u0026rsquo;s fixed deposit system, and getting chased by security guards while dumpster diving at a Dubai bank.\n","date":null,"permalink":"https://sheran.io/about/","section":"Home","summary":"\u003cp\u003eI do security audits and penetration testing for web and mobile applications.\u003c/p\u003e\n\u003cp\u003ePreviously, I was the founding CTO of Gojek (now GoTo), where I built their first cybersecurity team and scaled infrastructure from thousands to millions of users. Before that, I launched Zalora\u0026rsquo;s Indonesia e-commerce platform as VP of Tech. 
I\u0026rsquo;ve written \u003ca href=\"https://link.springer.com/search?dc.creator=Sheran\u0026#43;Gunasekera\u0026amp;facet-content-type=%22Book%22\" target=\"_blank\" rel=\"noreferrer\"\u003etwo books\u003c/a\u003e on mobile security and have 20+ years of experience across startups, banks, telcos, and oil \u0026amp; gas.\u003c/p\u003e","title":"About"},{"content":"","date":null,"permalink":"https://sheran.io/categories/","section":"Categories","summary":"","title":"Categories"}]