<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>Ali&#x27;s Blog</title>
    <link rel="self" type="application/atom+xml" href="https://abzrg.github.io/atom.xml"/>
    <link rel="alternate" type="text/html" href="https://abzrg.github.io"/>
    <generator uri="https://www.getzola.org/">Zola</generator>
    <updated>2026-06-02T00:00:00+00:00</updated>
    <id>https://abzrg.github.io/atom.xml</id>
    <entry xml:lang="en">
        <title>x86-64 Assembly on an M-series Mac</title>
        <published>2026-06-02T00:00:00+00:00</published>
        <updated>2026-06-02T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://abzrg.github.io/blog/x64-assembly-apple-macs/"/>
        <id>https://abzrg.github.io/blog/x64-assembly-apple-macs/</id>
        
        <content type="html" xml:base="https://abzrg.github.io/blog/x64-assembly-apple-macs/">&lt;p&gt;If you&#x27;re looking for a way to code and execute x86-64 programs on arm64 macs, this is a tutorial for you.
Through &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;support.apple.com&#x2F;en-us&#x2F;102527&quot;&gt;Rosetta 2&lt;&#x2F;a&gt;, a program written in C or assembly and compiled into an x86-64 binary can run on these machines as if it were an arm64 binary.
Here, I will specifically demonstrate how to write assembly code and call it from C code.&lt;&#x2F;p&gt;
&lt;hr &#x2F;&gt;
&lt;p&gt;It is really simple.
Write your assembly code in a &lt;code&gt;.s&lt;&#x2F;code&gt; file.
Within it expose a symbol that you later on want to call in a C file.&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;asm&quot;&gt;  .text
  .global _my_func
_my_func:
   ...
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;In this particular example, we made the _my_func symbol visible to the linker.&lt;&#x2F;p&gt;
&lt;p&gt;Then, in a C file, declare the function that corresponds to the symbol exposed in the assembly file. The compiler needs this declaration to compile the C code, before anything is linked against the assembly code.&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c&quot;&gt;&#x2F;&#x2F; ~~~

extern &amp;lt;return_type&amp;gt; my_func(&amp;lt;params&amp;gt;...);

&#x2F;&#x2F; ~~~

int main(void)
{...}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;blockquote class=&quot;markdown-alert-tip&quot;&gt;
&lt;p&gt;You don&#x27;t have to specify the &lt;code&gt;extern&lt;&#x2F;code&gt; keyword before the function prototype since functions in C are by default external.
However, for readability&#x27;s sake, I like to keep the &lt;code&gt;extern&lt;&#x2F;code&gt; keyword to
indicate that the function is defined in an assembly file.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
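&lt;p&gt;As a minimal sketch of that point (the function &lt;code&gt;twice&lt;&#x2F;code&gt; is hypothetical and only used for illustration), the two prototypes below declare the same function with external linkage, so a C compiler accepts them side by side:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c&quot;&gt;&#x2F;* Hypothetical example: both prototypes declare the same function. *&#x2F;
extern int twice(int x); &#x2F;* explicit extern *&#x2F;
int twice(int x);        &#x2F;* extern is implied for functions *&#x2F;

&#x2F;* Defined in C here; in this tutorial the definition would live in a .s file. *&#x2F;
int twice(int x)
{
    return 2 * x;
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;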
&lt;p&gt;Finally, we compile the whole thing by passing both the C and assembly files to the compiler, along with the &lt;code&gt;-target x86_64-apple-macos&lt;&#x2F;code&gt; flag.&lt;&#x2F;p&gt;
&lt;blockquote class=&quot;markdown-alert-note&quot;&gt;
&lt;p&gt;C compilers are able to cross-compile, that is, produce binaries that target a specific architecture and ABI.
In this case, by passing this flag, we essentially tell it to&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;assemble into x86-64 instructions&lt;&#x2F;li&gt;
&lt;li&gt;and, in doing so, use the Mac&#x27;s x86-64 ABI and the Mach-O binary file format.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;The result is an executable built for Intel x86_64 Macs.
But, again, because of Rosetta 2, ARM64 Macs are able to automatically translate those instructions into ARM64 instructions.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
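&lt;p&gt;One way to check which architecture the compiler was actually targeting is to look at its predefined macros. The sketch below assumes clang or GCC, which define &lt;code&gt;__x86_64__&lt;&#x2F;code&gt; for x86-64 targets and &lt;code&gt;__aarch64__&lt;&#x2F;code&gt; for 64-bit ARM targets; &lt;code&gt;target_arch&lt;&#x2F;code&gt; is a hypothetical helper name, not something from a library.&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c&quot;&gt;&#x2F;* Hypothetical helper: reports the architecture this file was compiled for. *&#x2F;
const char *target_arch(void)
{
#if defined(__x86_64__)
    return &amp;quot;x86_64&amp;quot;;   &#x2F;* e.g. built with -target x86_64-apple-macos *&#x2F;
#elif defined(__aarch64__) || defined(__arm64__)
    return &amp;quot;arm64&amp;quot;;    &#x2F;* native build on an M-series Mac *&#x2F;
#else
    return &amp;quot;unknown&amp;quot;;  &#x2F;* any other target *&#x2F;
#endif
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Since the macros are fixed at compile time, a build that uses the &lt;code&gt;-target x86_64-apple-macos&lt;&#x2F;code&gt; flag should report &lt;code&gt;&quot;x86_64&quot;&lt;&#x2F;code&gt; even when it runs on an ARM64 Mac.&lt;&#x2F;p&gt;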
&lt;h2 id=&quot;example-adding-integer-arrays-with-simd-operations&quot;&gt;Example: Adding Integer Arrays with SIMD Operations&lt;&#x2F;h2&gt;
&lt;p&gt;Finally, I&#x27;m gonna leave you with a simple example.
Let&#x27;s use &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.google.com&#x2F;search?q=sse+instructions&amp;amp;oq=sse+instructions&amp;amp;sourceid=chrome&amp;amp;ie=UTF-8&quot;&gt;SSE instructions&lt;&#x2F;a&gt; to add two arrays of integers.&lt;&#x2F;p&gt;
&lt;p&gt;Let&#x27;s go through the steps I mentioned above.
Here&#x27;s the assembly code.&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;asm&quot;&gt;# file: add_simd.s

.text
.globl _add_arrays

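# void add_arrays(
#     const int *a,   rdi
#     const int *b,   rsi
#     int *out,       rdx
#     int n           ecx (low half of rcx)
# )
# Note: n is assumed to be a multiple of 4.

_add_arrays:
    # n is a 32-bit int, so sign-extend it before the 64-bit compare
    movslq %ecx, %rcx
    xorq %r8, %r8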

.loop:
    cmpq %rcx, %r8
    jge .done

    # Load four integers from `a` (%rdi)
    movdqu (%rdi,%r8,4), %xmm0

    # Load four integers from `b` (%rsi)
    movdqu (%rsi,%r8,4), %xmm1

    # Add four integers in parallel
    paddd %xmm1, %xmm0

    # Store the result
    movdqu %xmm0, (%rdx,%r8,4)

    addq $4, %r8
    jmp .loop

.done:
    ret
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;blockquote class=&quot;markdown-alert-note&quot;&gt;
&lt;p&gt;Note the exposed symbol &lt;code&gt;_add_arrays&lt;&#x2F;code&gt;, which matches a C function with name &lt;code&gt;add_arrays&lt;&#x2F;code&gt;.
This leading-underscore convention is part of macOS&#x27;s x86-64 ABI.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;In the C code, we declare the &lt;code&gt;add_arrays&lt;&#x2F;code&gt; function at the top of the file.&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c&quot;&gt;&#x2F;&#x2F; file: main.c

extern void
add_arrays(const int *a,
           const int *b,
           int *out,
           int n);
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Testing the code:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c&quot;&gt;&#x2F;&#x2F; file: main.c

#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;

extern void add_arrays(const int *a, const int *b, int *out, int n);

int main(void)
{
    int a[] = {1, 2, 3, 4, 5, 6, 7, 8};
    int b[] = {10, 20, 30, 40, 50, 60, 70, 80};
    int out[8];

    add_arrays(a, b, out, 8);

    for (int i = 0; i &amp;lt; 8; i++) {
        printf(&amp;quot;%d &amp;quot;, out[i]);
    }

    printf(&amp;quot;\n&amp;quot;);
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;blockquote class=&quot;markdown-alert-note&quot;&gt;
&lt;p&gt;Each XMM register above is 128 bits wide, which means it can hold four 32-bit integers, so every loop iteration performs something like the following&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;text&quot;&gt;out[i + 0] = a[i + 0] + b[i + 0]
out[i + 1] = a[i + 1] + b[i + 1]
out[i + 2] = a[i + 2] + b[i + 2]
out[i + 3] = a[i + 3] + b[i + 3]
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;with a single SIMD addition instruction.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
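&lt;p&gt;For comparison, here is a scalar C sketch of the same computation, which can serve as a reference to check the assembly routine against. The name &lt;code&gt;add_arrays_ref&lt;&#x2F;code&gt; is made up for this illustration, and like the assembly version it assumes that &lt;code&gt;n&lt;&#x2F;code&gt; is a multiple of 4.&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c&quot;&gt;&#x2F;* Hypothetical scalar reference for add_arrays; assumes n is a multiple of 4. *&#x2F;
void add_arrays_ref(const int *a, const int *b, int *out, int n)
{
    for (int i = 0; i &amp;lt; n; i += 4) {
        &#x2F;* These four additions are what a single paddd does at once. *&#x2F;
        out[i + 0] = a[i + 0] + b[i + 0];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Calling both routines on the same inputs and comparing the two output arrays element by element is a simple sanity check for the SIMD version.&lt;&#x2F;p&gt;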
&lt;p&gt;Now, to compile&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;console&quot;&gt;$ clang -target x86_64-apple-macos main.c add_simd.s
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;We can inspect the output to see what kind of file was generated:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;console&quot;&gt;$ file a.out
a.out: Mach-O 64-bit executable x86_64
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Running it,&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;console&quot;&gt;$ .&#x2F;a.out
11 22 33 44 55 66 77 88
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>First Post</title>
        <published>2026-06-01T00:00:00+00:00</published>
        <updated>2026-06-01T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://abzrg.github.io/blog/first-post/"/>
        <id>https://abzrg.github.io/blog/first-post/</id>
        
        <content type="html" xml:base="https://abzrg.github.io/blog/first-post/">

&lt;div class=&quot;epigraph&quot;&gt;
  &lt;blockquote&gt;
    &lt;p&gt;The first principle is that you must not fool yourself — and you are the easiest person to fool.&lt;&#x2F;p&gt;
    
    &lt;footer&gt;
      Richard Feynman
      
      
    &lt;&#x2F;footer&gt;
    
  &lt;&#x2F;blockquote&gt;
&lt;&#x2F;div&gt;


&lt;div class=&quot;epigraph&quot;&gt;
  &lt;blockquote&gt;
    &lt;p&gt;It is a capital mistake to theorize before one has data.&lt;&#x2F;p&gt;
    
  &lt;&#x2F;blockquote&gt;
&lt;&#x2F;div&gt;


&lt;div class=&quot;epigraph&quot;&gt;
  &lt;blockquote&gt;
    &lt;p&gt;I have made this longer than usual because I have not had time to make it shorter.&lt;&#x2F;p&gt;
    
    &lt;footer&gt;
      Blaise Pascal
      , 
      Lettres Provinciales
    &lt;&#x2F;footer&gt;
    
  &lt;&#x2F;blockquote&gt;
&lt;&#x2F;div&gt;
&lt;hr &#x2F;&gt;
&lt;h2 id=&quot;quotes&quot;&gt;Quotes&lt;&#x2F;h2&gt;


&lt;blockquote&gt;
  &lt;p&gt;Graphical excellence is the well-designed presentation of interesting data.&lt;&#x2F;p&gt;
  
  &lt;footer&gt;
    Edward Tufte
    , 
    &lt;cite&gt;The Visual Display of Quantitative Information&lt;&#x2F;cite&gt;
  &lt;&#x2F;footer&gt;
  
&lt;&#x2F;blockquote&gt;
&lt;p&gt;This is exactly what Orwell had in mind when he wrote:&lt;&#x2F;p&gt;


&lt;blockquote&gt;
  &lt;p&gt;Never use a long word where a short one will do.&lt;&#x2F;p&gt;
  
  &lt;footer&gt;
    George Orwell
    , 
    &lt;cite&gt;Politics and the English Language&lt;&#x2F;cite&gt;
  &lt;&#x2F;footer&gt;
  
&lt;&#x2F;blockquote&gt;
&lt;p&gt;And that principle applies equally to code.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;side-notes-and-margin-notes&quot;&gt;Side notes and margin notes&lt;&#x2F;h2&gt;
&lt;p&gt;This is the &lt;code&gt;simd&lt;&#x2F;code&gt; first post.&lt;label for=&quot;sn-sn-1&quot; class=&quot;margin-toggle sidenote-number&quot;&gt;&lt;&#x2F;label&gt;
&lt;input type=&quot;checkbox&quot; id=&quot;sn-sn-1&quot; class=&quot;margin-toggle&quot;&#x2F;&gt;
&lt;span class=&quot;sidenote&quot;&gt;You can use all your shortcodes here too.&lt;&#x2F;span&gt;
&lt;&#x2F;p&gt;
&lt;h2 id=&quot;full-width-stuff&quot;&gt;Full-width stuff&lt;&#x2F;h2&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c++&quot;&gt;#include &amp;lt;iostream&amp;gt;

int main(void)
{
  std::cout &amp;lt;&amp;lt; &amp;quot;Hello, World\n&amp;quot;;
}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;div class=&quot;fullwidth&quot;&gt;
&lt;pre&gt;&lt;code data-lang=&quot;python&quot;&gt;# a wide block that needs more horizontal space
result = some_very_long_function_name(argument_one, argument_two, argument_three) 
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;&#x2F;div&gt;
&lt;h2 id=&quot;alert-environments&quot;&gt;Alert environments&lt;&#x2F;h2&gt;
&lt;blockquote class=&quot;markdown-alert-note&quot;&gt;
&lt;p&gt;Beware of double pointers!&lt;&#x2F;p&gt;
&lt;p&gt;hello world is&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;c&quot;&gt;int main(void) { ... }
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;and then later it is note as ...&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;blockquote class=&quot;markdown-alert-tip&quot;&gt;
&lt;p&gt;A helpful tip for the reader.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;blockquote class=&quot;markdown-alert-important&quot;&gt;
&lt;p&gt;Something the reader really should not miss.
This is the &lt;code&gt;simd&lt;&#x2F;code&gt; first post.&lt;label for=&quot;sn-sn-2&quot; class=&quot;margin-toggle sidenote-number&quot;&gt;&lt;&#x2F;label&gt;
&lt;input type=&quot;checkbox&quot; id=&quot;sn-sn-2&quot; class=&quot;margin-toggle&quot;&#x2F;&gt;
&lt;span class=&quot;sidenote&quot;&gt;You can use all your shortcodes here too.&lt;&#x2F;span&gt;
&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;blockquote class=&quot;markdown-alert-warning&quot;&gt;
&lt;p&gt;A warning about potential pitfalls.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;blockquote class=&quot;markdown-alert-caution&quot;&gt;
&lt;p&gt;A strong caution about destructive or irreversible actions.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;To use Tufte CSS, copy tufte.css and the et-book directory of font files to your project directory, then add the following to your HTML document’s head block:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;html&quot;&gt;&amp;lt;link rel=&amp;quot;stylesheet&amp;quot; href=&amp;quot;tufte.css&amp;quot;&#x2F;&amp;gt;
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;h2 id=&quot;math-support&quot;&gt;Math support&lt;&#x2F;h2&gt;
&lt;p&gt;$$
x^2
\frac{1}{x}
12\,\rm{m}^2
$$&lt;&#x2F;p&gt;
&lt;p&gt;Inline math: 

&lt;span class=&quot;math&quot;&gt;\(\frac{1}{2}\)&lt;&#x2F;span&gt;
&lt;&#x2F;p&gt;
&lt;p&gt;Display math:


&lt;div class=&quot;math-display&quot;&gt;\[\int_0^\infty e^{-x^2} dx = \frac{\sqrt{\pi}}{2}
12\,\rm{m}^2\]&lt;&#x2F;div&gt;
&lt;&#x2F;p&gt;
&lt;p&gt;...&lt;&#x2F;p&gt;


&lt;div class=&quot;math-display&quot;&gt;\[x^2
\frac{1}{x}
12\,\rm{m}^2\]&lt;&#x2F;div&gt;
&lt;p&gt;...&lt;&#x2F;p&gt;
&lt;div class=&quot;math-display&quot;&gt;\[\int_0^\infty...\]&lt;&#x2F;div&gt;
</content>
        
    </entry>
</feed>
