PowerShell byte count formatting, simplified

It was pointed out that the function I was using to convert byte counts to KB/MB/GB format was over-complicated, so I resolved to simplify it. Going back to the drawing board, I came up with this:


Filter ConvertTo-KMG {
    <#
    .Synopsis
     Converts byte counts to Byte\KB\MB\GB\TB\PB format
    .DESCRIPTION
     Accepts an [int64] byte count, and converts to Byte\KB\MB\GB\TB\PB format
     with decimal precision of 2
    .EXAMPLE
     3000 | convertto-kmg
    #>

    $bytecount = $_

    switch ([math]::truncate([math]::log($bytecount,1024))) {
        0       {"$bytecount Bytes"}
        1       {"{0:n2} KB" -f ($bytecount / 1kb)}
        2       {"{0:n2} MB" -f ($bytecount / 1mb)}
        3       {"{0:n2} GB" -f ($bytecount / 1gb)}
        4       {"{0:n2} TB" -f ($bytecount / 1tb)}
        Default {"{0:n2} PB" -f ($bytecount / 1pb)}
    }
}

12 responses to “PowerShell byte count formatting, simplified”

  1. OK, I couldn’t stop myself from trying to write it in a slightly different manner. Yours is better for sure, but… 🙂 Take a look at this:
    filter Get-BestValue {
        $Value = $_
        "{0:N2} {1}" -f $(
            # The log is converted to a string, so its first digit selects the unit
            switch -Regex ([Math]::Log($Value,1024)) {
                ^0 {
                    $Value, 'Bytes'
                }
                ^1 {
                    ($Value/1KB), 'KB'
                }
                ^2 {
                    ($Value/1MB), 'MB'
                }
                ^3 {
                    ($Value/1GB), 'GB'
                }
                ^4 {
                    ($Value/1TB), 'TB'
                }
                default {
                    ($Value/1PB), 'PB'
                }
            })
    }

  2. That’ll work, too 🙂. I think the big improvement comes from using the base-1024 logarithm to eliminate all the repetitive divide-and-compare operations you normally see used to determine which unit label to use.
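
To make the log trick concrete, here is a minimal sketch of how the integer part of the base-1024 logarithm picks the unit. The `$units` lookup table and the sample values are illustrative assumptions, not part of the original filter, and the sample values deliberately avoid exact powers of 1024.

```powershell
# Illustrative lookup table (an assumption, not from the post): index 0 = Bytes, 1 = KB, ...
$units = 'Bytes', 'KB', 'MB', 'GB', 'TB', 'PB'

$rows = foreach ($bytecount in 500, 3000, 5mb, 7tb) {
    # [math]::Log(x, 1024) answers "1024 raised to what power gives x?"
    $exponent = [math]::Log($bytecount, 1024)

    # Dropping the fractional part leaves the index of the largest unit that fits
    $index = [int][math]::Truncate($exponent)

    '{0,15} -> log1024 = {1:n3} -> index {2} -> {3}' -f $bytecount, $exponent, $index, $units[$index]
}

$rows
```

Each value lands on the index of its unit without any chain of comparisons, which is what the switch statement in the post relies on.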

  3. Yes, THAT part was actually responsible for 99% of my “WOW” effect. It’s like Columbus’s egg: so obvious when you see it, so hard to figure out until you do. 🙂

  4. Hello mjolinor, hello Bartek,

    very interesting solutions to one of the problems presented in #2012sg, too!
    I remember that we should use the most appropriate unit of measurement to represent file sizes and memory amounts.
    My solution was fine, but not optimal in terms of performance …
    I think that the easiest solution is the fastest and most comprehensible one:

    filter Get-BestValue2 {
        "{0:N2} {1}" -f $(
            if ($_ -lt 1kb) { $_, 'Bytes' }
            elseif ($_ -lt 1mb) { ($_/1kb), 'kb' }
            elseif ($_ -lt 1gb) { ($_/1mb), 'mb' }
            elseif ($_ -lt 1tb) { ($_/1gb), 'gb' }
            elseif ($_ -lt 1pb) { ($_/1tb), 'tb' }
            else { ($_/1pb), 'pb' }
        )
    }

    Klaus (Schulte)

  5. Hello, Klaus. Thanks for the feedback (it’s always good to get comments).

    I used to pursue “optimal performance” as a primary criterion in my scripts, but have since reconsidered that position. Now I look first for the solution that presents itself intuitively to the person reading it.

    I will optimize a solution for performance when circumstances warrant (typically code inside a loop that is going to be iterated through thousands of times).

    This all gets to be kind of subjective and philosophical, but basically it comes to this: people time is MUCH more expensive than processor time, and that gap grows wider every day. If I can trade a few extra milliseconds of processor time in execution for even a minute of my or someone else’s time if they ever have to go back and debug or modify it, it’s probably worth it.

    That’s just my personal philosophy at the moment, and I’m always happy to entertain any argument that might change my mind about that. I’ve been wrong before, will be again, and might be right now :).

  6. I could not agree more. Scripts are not the same as compiled code, so users may often want to see the ‘guts’ and maybe tweak them. If, with better performance, you lose too much readability, you are not really helping anybody. Obviously, if performance is key and the difference is huge (like some examples where Rob used hashtables instead of native commands, if I recall correctly), then this is a totally different story. Here, performance is not the key. And I really love the simplicity of using log() to calculate the best unit. It’s obvious what is going on, right? 🙂

  7. @Bartek: As all the #sg2012 scripts are rated by now, I may disagree with the judge in you :-)
    Using log() isn’t as obvious as conditional logic if you aren’t a mathematician!
    I showed the log() solution and the simple if-then-else solution to a couple of colleagues and friends, and none of them could see at first sight that the log function did the right thing; they would have had to consult their old mathematics book first to make sure that it works.
    That a statement like
    “if value is less than 1Kilobytes then output the result as Bytes”
    “elseif value is less than 1Megabytes then output the result as KiloBytes”

    is correct, was obvious at first glance!
    Klaus.

    • Klaus, I think we’re just looking at it from different perspectives.

      When I say I look for a solution that “presents itself intuitively”, I am talking about lexical presentation, with the primary concern being to convey what that piece of code is doing. This is not the same thing as considering how it presents itself logically, with the primary concern being to convey how it’s doing it.

      In this case, what the code is doing is dividing the input value by a chosen number, and outputting a formatted string of the result. What makes it more intuitive to read is that what it’s doing is presented lexically isolated from the gory details of how it’s doing it.

      The choice of using a switch statement rather than a stack of If statements is part of that.

      Any Switch statement can be replaced with a stack of If statements, so why have it at all? I believe the answer to that is that it produces a more lexically intuitive presentation of what the code is doing by getting the “What” out of the middle of the “How”.

      In the same vein, I find code that creates an object from a hash table to be more intuitive to read than the code to create the same object from a stack of add-member commands, and I find it better to read code that implements a command that uses numerous parameters if they’re splatted from a hash table.
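
The contrast described above can be sketched as follows. The property names and values (`report.log`, `2.93 KB`) and the choice of `Format-Table` are illustrative assumptions, picked only to keep the example self-contained.

```powershell
# Object built from a stack of Add-Member commands
$viaAddMember = New-Object psobject
$viaAddMember | Add-Member -MemberType NoteProperty -Name Name -Value 'report.log'
$viaAddMember | Add-Member -MemberType NoteProperty -Name Size -Value '2.93 KB'

# The same object created from a hash table: the "what" is visible at a glance
$viaHashTable = New-Object psobject -Property @{
    Name = 'report.log'
    Size = '2.93 KB'
}

# Splatting: the parameters are collected in a hash table and passed with @
$formatArgs = @{
    Property = 'Name', 'Size'
    AutoSize = $true
}
$viaHashTable | Format-Table @formatArgs
```

Both objects end up with the same two properties; the hash table versions simply keep the data together and out of the middle of the plumbing.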

  8. Dear mjolinor,

    I totally agree with you and I love the hashtable approach, too!
    In fact I’ve just been testing some alternatives with a single goal: speed!
    I did use the switch statement in the #2012sg event and I like it.
    But it’s slow. I even added “break” statements and tried to use “return” instead, which made it even slower! That’s really a totally different perspective 🙂
    Anyway … good to meet you here!
    ( btw: I’ve started thinking of my own blog, which would be aimed at beginners … such as “from CMD to POWERSHELL” … if I have the time … )

    Klaus.
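
For anyone who wants to check the speed claim on their own machine, a rough timing sketch with `Measure-Command` might look like this. The filter names, the sample data, and the sample size are assumptions for illustration; the numbers will vary with hardware and PowerShell version, so no result is claimed here.

```powershell
# Hypothetical names: the switch/log approach from the post
filter Format-WithSwitch {
    $bytecount = $_
    switch ([math]::Truncate([math]::Log($bytecount, 1024))) {
        0       { "$bytecount Bytes" }
        1       { '{0:n2} KB' -f ($bytecount / 1kb) }
        2       { '{0:n2} MB' -f ($bytecount / 1mb) }
        3       { '{0:n2} GB' -f ($bytecount / 1gb) }
        4       { '{0:n2} TB' -f ($bytecount / 1tb) }
        default { '{0:n2} PB' -f ($bytecount / 1pb) }
    }
}

# Hypothetical name: the if/elseif approach from the comments
filter Format-WithIf {
    if     ($_ -lt 1kb) { "$_ Bytes" }
    elseif ($_ -lt 1mb) { '{0:n2} KB' -f ($_ / 1kb) }
    elseif ($_ -lt 1gb) { '{0:n2} MB' -f ($_ / 1mb) }
    elseif ($_ -lt 1tb) { '{0:n2} GB' -f ($_ / 1gb) }
    elseif ($_ -lt 1pb) { '{0:n2} TB' -f ($_ / 1tb) }
    else                { '{0:n2} PB' -f ($_ / 1pb) }
}

# Arbitrary positive sample values spanning the KB and MB ranges
$sample = 1..20000 | ForEach-Object { [int64]$_ * 7919 }

$switchTime = Measure-Command { $sample | Format-WithSwitch | Out-Null }
$ifTime     = Measure-Command { $sample | Format-WithIf     | Out-Null }

'switch/log: {0:n0} ms' -f $switchTime.TotalMilliseconds
'if/elseif : {0:n0} ms' -f $ifTime.TotalMilliseconds
```

A single run like this is only a rough indication; repeating it a few times gives a better feel for whether the difference matters for a given script.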

  9. @Klaus – well, you got me. I always loved maths. Seriously. For me using log is just pure eye candy. Not only do I not have to think about what it will do, it’s obvious and just stunningly easy, so I just love to see it applied like that.
    Regarding switch versus if: switch is in so many ways more powerful than a stack of ifs that I hardly use the latter. I use if only if I’m sure there will be only two options (even an elseif is usually enough to push me to switch).
    And regarding grades… well, it’s not over yet (evil laughter), let me seeee.. 😉

  10. Pingback: More on output… | IT Pro PowerShell experience

  11. I totally agree with the short solution of using the Log() function, which greatly simplifies the code.
    One consideration, though: for the logarithm of 0 to any base greater than 1 (like 1024), PowerShell returns “-Infinity”, so your function returns “0.00 PB” for an input of 0 bytes. I fixed my code by adding this first line:
    if ($bytes -le 0) {return "0 bytes"}
    Thank you very much
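
Applied to the filter from the post, that guard might look like the sketch below. The name `ConvertTo-KMGSafe` is hypothetical, and returning the raw count for non-positive input (rather than a fixed "0 bytes") is an assumption made here so that negative values stay visible.

```powershell
filter ConvertTo-KMGSafe {   # hypothetical name, to avoid clashing with the original filter
    $bytecount = $_

    # [math]::Log(0, 1024) is -Infinity, which would fall through to the
    # default (PB) branch below, so handle non-positive input up front
    if ($bytecount -le 0) { return "$bytecount Bytes" }

    switch ([math]::Truncate([math]::Log($bytecount, 1024))) {
        0       { "$bytecount Bytes" }
        1       { '{0:n2} KB' -f ($bytecount / 1kb) }
        2       { '{0:n2} MB' -f ($bytecount / 1mb) }
        3       { '{0:n2} GB' -f ($bytecount / 1gb) }
        4       { '{0:n2} TB' -f ($bytecount / 1tb) }
        default { '{0:n2} PB' -f ($bytecount / 1pb) }
    }
}

0, 3000 | ConvertTo-KMGSafe
```

With the guard in place, an input of 0 yields `0 Bytes` instead of `0.00 PB`, and 3000 still formats as `2.93 KB` (the decimal separator depends on the current culture).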
