
The Slack MCP Server Without An OAuth Token: Driving The Desktop App Instead Of The API

Every single result on the first page of Google for "slack mcp server" is an API wrapper: Slack's official server, korotovsky/slack-mcp-server, Workato, PulseMCP, apidog, mcp.so. They all call search.messages and chat.postMessage and they all need an admin to approve an OAuth app in Workspace Settings. macos-use is the one that never touches Slack's API. It drives the macOS Slack desktop app through the accessibility tree, and the server literally ships a Slack-specific usage example baked into Sources/MCPServer/main.swift:1424 that gets sent to Claude Desktop on every initialize handshake.

Matthew Diakonov
Zero OAuth tokens, zero REST calls, zero admin approvals
One JSON-RPC request becomes click + type + Return inside Slack.app
Works on Enterprise Grid and GovSlack because there is no HTTP path to block

The SERP Is Unanimous. It Is Also Only One Half Of The Answer.

Search for "slack mcp server" and every result on the first page describes the same architecture. A server process holds a Slack OAuth token, exposes MCP tools named slack_search_messages and slack_post_message, and translates those tool calls into HTTPS requests against slack.com/api/. It is a fine architecture. It has one structural cost: the Slack admin has to enable MCP for your workspace and approve the app. On Enterprise Grid and GovSlack, and on any corporate workspace where IT has not gotten around to it yet, that gate is shut.

There is a second way to build a Slack MCP server that nobody on the first page mentions. You can drive the Slack desktop app the same way a human does. On macOS, every app exposes an accessibility tree. In Slack's tree, the compose field is a searchable element labeled "Message to <channel>". If you can find that element, click it, type into it, and press Return, you have sent a message. No token, no rate limit, no app review.

macos-use is the second way. This guide is about the six-line shape of what it actually does when the model sends a Slack message, and the one exact line in main.swift that makes the model know to do it that way.

Two Architectures For The Same Tool Name

The tool the AI client sees is similar enough that you can swap between them. The mechanism underneath is completely different. Here is the flip.

Slack MCP server: API-based vs. desktop-driven

A hosted or self-hosted HTTP server holds an OAuth token for your workspace and calls Slack's REST API on the model's behalf. chat.postMessage for sending, search.messages for reading. Everything flows over HTTPS.

  • Admin has to approve the MCP integration in Workspace Settings
  • OAuth token with scopes: chat:write, search:read, channels:read
  • Rate limits apply (Tier 3 = 50+ req/min)
  • Does not work on workspaces where admin has not enabled MCP
  • Search covers the full history, not just what is on screen

The Slack Example Is Inside The Server's Own Instructions String

In the MCP spec, the instructions field on the Server constructor is optional guidance that clients pass along to the model. macos-use uses it to teach the model one specific thing: how to compose actions. The only per-app example in the entire block is Slack. Clone the repo, run grep -n "Slack message box" Sources/, and you get exactly one hit at main.swift:1424. That sentence is the guidance the server hands every MCP client on connect.
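To make the handshake concrete, here is a minimal Python sketch of how a client could pull the optional instructions string out of an initialize result and check whether it names an app. The field layout follows the MCP specification; the sample response, the server name "example-server", and both helper functions are hypothetical and are not taken from the macos-use source.

```python
import json

# Hypothetical initialize response. Only the field layout follows the MCP
# spec; the serverInfo values are placeholders, not the real server's.
SAMPLE_INITIALIZE_RESPONSE = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}},
        "serverInfo": {"name": "example-server", "version": "0.0.0"},
        "instructions": (
            "Example: to type into a Slack message box and send it, use ONE "
            'click_and_traverse call with element="Message to X", '
            'text="hello", pressKey="Return".'
        ),
    },
})


def extract_instructions(raw_response: str) -> str:
    """Return the optional `instructions` string, or "" if it is absent."""
    message = json.loads(raw_response)
    return message.get("result", {}).get("instructions", "")


def mentions_app(instructions: str, app_name: str) -> bool:
    """Case-insensitive check for an app name inside the instructions text."""
    return app_name.lower() in instructions.lower()


if __name__ == "__main__":
    text = extract_instructions(SAMPLE_INITIALIZE_RESPONSE)
    print(mentions_app(text, "Slack"))
```

Because the field is optional, the helper returns an empty string rather than raising when a server omits it.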

Sources/MCPServer/main.swift:1411-1437

Other MCP Servers Named "slack", Grouped By What They Actually Do

Six of the seven results on the first Google page describe the same class of server. Here they are in a row.

Slack Official MCP: hosted REST API wrapper
korotovsky/slack-mcp-server: self-hosted REST API wrapper
Workato Slack MCP: iPaaS / REST
PulseMCP Slack: directory listing of REST wrappers
apidog guide: REST tutorial
mcp.so/server/slack: REST directory
macos-use: accessibility tree, no REST

Six of seven sit on the REST API. One sits on the accessibility tree.

What One JSON-RPC Request Becomes Inside The Server

The sequence below is the full client-server-app exchange for sending one Slack message. The client sends one callTool; the server turns it into three OS-level effects and one final traversal. The AI model never sees the intermediate state, which is why the server is responsible for orchestrating the chain, not the client.

One callTool → click + type + Return in Slack.app

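Participants: AI Client, mcp-server-macos-use, Slack.app, disk

1. AI Client → mcp-server-macos-use: callTool(click_and_traverse)
2. mcp-server-macos-use: InputGuard.engage() blocks human input
3. mcp-server-macos-use → Slack.app: find AXTextArea 'Message to standups'
4. Slack.app → mcp-server-macos-use: x:412 y:1034 w:1180 h:40
5. mcp-server-macos-use → Slack.app: CGEvent click at (1002, 1054)
6. mcp-server-macos-use → Slack.app: type 'standup in 5 min…' keystrokes
7. mcp-server-macos-use → Slack.app: pressKey Return
8. mcp-server-macos-use → Slack.app: traverse AX tree again
9. Slack.app → mcp-server-macos-use: 412 elements incl. new message
10. mcp-server-macos-use → disk: write <ts>_click_and_traverse.txt + .png
11. mcp-server-macos-use: InputGuard.disengage(), cursor restored
12. mcp-server-macos-use → AI Client: compact summary (file, screenshot, status)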

Where Each Hop Goes

AnimatedBeam view of the same exchange. The AI client on the left writes one request. macos-use in the middle does the orchestration. Slack.app on the right receives three synthetic events, exactly as if a human had clicked, typed, and pressed Return.

client → server → Slack.app, one round trip

Clients: Claude Desktop, Cursor, VS Code MCP
Server: mcp-server-macos-use
Outputs: Slack compose, Keystrokes, Return, AX tree + PNG

The Numbers You Can Reproduce From The Current Commit

Clone github.com/mediar-ai/mcp-server-macos-use, run wc -l and grep -c against the files, and these numbers fall out. Nothing is tuned at runtime.

1424: line where the Slack example lives (main.swift:1424)
11: hardware event types blocked during a send
100 ms: sleep between click → type → Return
0: OAuth tokens required to send a Slack message
1: JSON-RPC callTool request per message sent
3: OS-level effects (click, type, Return)
53: keycode for Esc (the universal cancel)

Also reproducible from the source: the number of MCP tools exposed over JSON-RPC, the seconds before the InputGuard watchdog auto-releases, and the total line count of Sources/MCPServer/main.swift.

One Request On The Wire, Three Effects Inside Slack

The JSON-RPC payload below is the verbatim shape of what the AI client sends. The Swift block underneath is the code path at main.swift:1709-1741 that splits it into three OS-level effects with an InputGuard.throwIfCancelled check at every boundary. If you press Esc between steps, the chain aborts cleanly and the steps that have not run yet never land.

client → server, over stdio
Sources/MCPServer/main.swift:1709-1741
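The boundary-check pattern described above can be sketched in a few lines of Python. This illustrates the technique only (run ordered steps, check a cancel flag before each one, pause between steps) and is not a transcription of the Swift at main.swift:1709-1741. The names run_chain and Cancelled are hypothetical.

```python
import time
from typing import Callable, List, Sequence, Tuple


class Cancelled(Exception):
    """Raised when the cancel flag is set at a step boundary."""

    def __init__(self, next_step: str, completed: Sequence[str]):
        super().__init__(f"cancelled before {next_step}")
        self.completed = list(completed)


def run_chain(
    steps: Sequence[Tuple[str, Callable[[], None]]],
    is_cancelled: Callable[[], bool],
    delay_s: float = 0.1,
) -> List[str]:
    """Run steps in order, checking for cancellation before each one.

    Returns the names of the steps that ran. Raises Cancelled, carrying the
    names that already ran, if the flag is set at a boundary.
    """
    done: List[str] = []
    for index, (name, action) in enumerate(steps):
        if is_cancelled():
            raise Cancelled(name, done)
        if index > 0:
            time.sleep(delay_s)  # short pause between steps (100 ms by default)
        action()
        done.append(name)
    return done


if __name__ == "__main__":
    log: List[str] = []
    chain = [
        ("click", lambda: log.append("click")),
        ("type", lambda: log.append("type")),
        ("press", lambda: log.append("press")),
    ]
    print(run_chain(chain, is_cancelled=lambda: False))
```

Checking the flag before every step means a cancel can only ever leave a prefix of the chain applied, which is what makes an abort predictable.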

Eight Stages From callTool To The Message In The Channel

Each step below is one line-range in the source. Nothing is abstract; everything is grepable.

1. Client serializes callTool over stdio

Claude Desktop writes one newline-delimited JSON-RPC frame to the server's stdin. The JSON-RPC method is tools/call (callTool in the MCP SDKs), the tool name is macos-use_click_and_traverse, and the arguments carry pid, the element substring, the text body, and pressKey.
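As a rough sketch of that frame, the Python below builds one newline-terminated JSON-RPC request. On the wire the MCP specification names the tool-call method tools/call. The tool name and argument keys come from this guide, while build_call_tool_frame and the sample values are hypothetical.

```python
import json


def build_call_tool_frame(
    request_id: int, pid: int, element: str, text: str, press_key: str
) -> str:
    """Serialize one MCP tool call as a single newline-terminated JSON line."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "macos-use_click_and_traverse",
            "arguments": {
                "pid": pid,
                "element": element,
                "text": text,
                "pressKey": press_key,
            },
        },
    }
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return json.dumps(request, separators=(",", ":")) + "\n"


if __name__ == "__main__":
    frame = build_call_tool_frame(
        request_id=7,
        pid=4381,  # example PID used elsewhere in this guide
        element="Message to standups",
        text="standup in 5 min, room 3A",
        press_key="Return",
    )
    print(frame, end="")
```

A real client would write this string to the server's stdin and read the matching response line from its stdout.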

2. Server engages InputGuard, saves cursor and frontmost app

main.swift:1672-1682 saves the current NSWorkspace frontmostApplication and CGEvent.mouseCursorPosition. InputGuard.shared.engage() installs a .cghidEventTap (InputGuard.swift:113) that blocks 11 hardware event types from the human until the tool returns.

3. Element matcher walks Slack's accessibility tree

main.swift:1054 walks the AXUIElement tree of PID 4381. It matches any element whose AX text, description, or label contains "Message to standups". Slack renders the compose field as an AXTextArea with that exact label, so one match, one set of (x, y, w, h).
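The matching rule can be sketched as a substring filter over a flattened tree. This is a minimal Python illustration, assuming a case-sensitive match and a flat list of elements. The Element type, its fields, and find_elements are hypothetical and do not mirror the Swift matcher.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Element:
    """One node of a flattened accessibility tree (hypothetical shape)."""
    role: str
    text: str
    description: str
    x: float
    y: float
    w: float
    h: float


def find_elements(elements: Sequence[Element], needle: str) -> List[Element]:
    """Return every element whose text or description contains `needle`."""
    return [
        element
        for element in elements
        if needle in element.text or needle in element.description
    ]


if __name__ == "__main__":
    tree = [
        Element("AXButton", "Send now", "", 1540, 1040, 32, 28),
        Element("AXTextArea", "", "Message to standups", 412, 1034, 1180, 40),
        Element("AXStaticText", "standup notes", "", 420, 600, 300, 18),
    ]
    for match in find_elements(tree, "Message to standups"):
        print(match.role, match.x, match.y, match.w, match.h)
```

A substring that matches more than one element would return several candidates, which is why naming the channel in full keeps the result unambiguous.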

4. CGEvent.post fires a synthetic click at the center

main.swift:1574 computes (x + w/2, y + h/2) = (1002, 1054). CGEvent.post(tap: .cghidEventTap) sends mouseDown + mouseUp with .hidSystemState source (non-zero sourceStateID), so the tap callback at InputGuard.swift:329 lets it through while still blocking the human.
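The centering arithmetic is easy to check by hand. The Python sketch below reproduces it for the bounds quoted in this walkthrough (x:412 y:1034 w:1180 h:40). The function names click_point and contains are hypothetical.

```python
from typing import Tuple


def click_point(x: float, y: float, w: float, h: float) -> Tuple[float, float]:
    """Return the center of a rectangle given its origin and size."""
    return (x + w / 2, y + h / 2)


def contains(x: float, y: float, w: float, h: float,
             point: Tuple[float, float]) -> bool:
    """True if `point` lies inside (or on the edge of) the rectangle."""
    px, py = point
    return x <= px <= x + w and y <= py <= y + h


if __name__ == "__main__":
    center = click_point(412, 1034, 1180, 40)
    print(center)  # (1002.0, 1054.0)
    print(contains(412, 1034, 1180, 40, center))
```

Clicking the center rather than the origin keeps the event well inside the element even if the reported bounds are off by a pixel or two.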

5. Type path posts keystrokes, then Return

The composed chain at main.swift:1726-1733 sleeps 100ms, posts each character of "standup in 5 min, room 3A" as synthetic keyDown + keyUp pairs, sleeps 100ms, posts Return. Slack's React compose handler sees native key events indistinguishable from a real user typing.

6. Final traversal captures the new message in the tree

main.swift:1737-1741 runs a fresh traverseAccessibilityTree on Slack's PID. The new AXStaticText line containing the message body now appears in the scrollable message list. The tree is written to /tmp/macos-use/<ts>_click_and_traverse.txt.

7. Screenshot subprocess snaps the Slack window

main.swift:435-510 spawns the sibling `screenshot-helper` binary with the Slack window bounds and an optional (1002, 1054) crosshair. The helper writes the PNG and exits; ReplayKit dies with the subprocess instead of spinning at 19% CPU in the long-lived server.

8. InputGuard disengages, cursor and frontmost app are restored

The 11-event tap is torn down at InputGuard.swift:109. The saved cursor position is re-posted via CGEvent mouseMoved. If Slack was not originally frontmost, prevApp.activate([]) returns focus to whatever app the human was in. The compact summary is serialized and written to stdout.

Watch One Send, End To End, In stderr

The server writes a step-by-step log line for every action. The grep at the end verifies the new message is actually in Slack's accessibility tree, not just assumed to be there.

One callTool → standup message lands in #standups

Against The API-Based Slack MCP Servers

Feature | API-based Slack MCP servers | macos-use (desktop-driven)
Authentication | OAuth token with scopes (chat:write, search:read, channels:read) | macOS Accessibility permission once, per your user
Needs admin approval | Yes. Workspace admin enables MCP in Workspace Settings. | No. Inherits whatever you can do in Slack manually.
Works on Enterprise Grid / GovSlack without admin enablement | No | Yes. There is no HTTP path an admin could block.
Rate limits | Slack API tier limits (Tier 3 = 50+ req/min) | Only UI reactivity; ~100ms between steps is plenty
Search history depth | Full indexed history via search.messages | Only what is visible; scroll to load more
Runs where | Any HTTP host | Only the one Mac where Slack.app is signed in
Slack-specific code in the server | Dozens of endpoint wrappers | One example line at main.swift:1424
What crosses the network | Every tool call becomes HTTPS to slack.com/api/ | Nothing. All events stay inside the one Mac.
1 line

Example: to type into a Slack message box and send it, use ONE click_and_traverse call with element="Message to X", text="hello", pressKey="Return". Do NOT split into separate click, type, and press calls.

Sources/MCPServer/main.swift:1424, inside the server's MCP instructions string

Six Slack Things You Can Do Without A Token

Each of these is one MCP tool call against the running Slack desktop app. No per-action code. No per-channel config.

Send a message to a channel

One click_and_traverse with element="Message to #channel", text=body, pressKey="Return". The line that tells the model to compose it this way is main.swift:1424.

Reply in a thread

Click the message to open the thread pane, then click_and_traverse with element="Reply to" and text, pressKey="Return".

Add an emoji reaction

click_and_traverse the message to hover, then click_and_traverse element="Add reaction", then type the shortcode, Return.

Read the last 20 messages in a DM

One refresh_traversal call dumps the visible AX tree. grep the /tmp/macos-use/<ts>.txt for AXStaticText lines.

Jump to a workspace with Cmd-Option-N

press_key_and_traverse with keys=["cmd","option","1"] switches to the first workspace and re-captures the tree.

Send to a new DM from quick switcher

Press Cmd-K, type the name, press Return, then call click_and_traverse with element="Message to …" and your body as text.

Try It On Your Own Machine

git clone https://github.com/mediar-ai/mcp-server-macos-use
cd mcp-server-macos-use
xcrun --toolchain com.apple.dt.toolchain.XcodeDefault swift build -c release

# Confirm the Slack-specific example is real, in one line:
grep -n "Slack message box" Sources/MCPServer/main.swift
# → 1424: - Example: to type into a Slack message box and send it, …

# Point your MCP client at .build/release/mcp-server-macos-use
# Grant macOS Accessibility permission on first run.
# Open Slack. Ask the model: "send 'hi' to #random".

# Watch the compact summary and the tree file:
ls -lt /tmp/macos-use/ | head -3

# Confirm the message actually landed:
grep -n "hi" /tmp/macos-use/*_click_and_traverse.txt | tail -3

Frequently Asked Questions

Is this the same thing as Slack's official MCP server?

No. Slack's official MCP server, launched in 2026, is a hosted server at slack.dev that calls Slack's REST API on your behalf (search.messages, chat.postMessage, users.list, and so on). It needs an admin to approve the MCP integration in Workspace Settings and issues an OAuth token scoped to that approval. macos-use is the opposite approach: it is a local Swift binary that drives the macOS Slack desktop app through the Accessibility APIs. No token, no admin approval, no REST calls. It works for any workspace you are already signed into on your Mac, including Enterprise Grid and workspaces where the admin has not enabled the official MCP.

How does the server know how to send a Slack message?

The hint is in the server's own instructions string, sent to the MCP client on every initialize handshake. Look at Sources/MCPServer/main.swift:1424. The exact line reads: `Example: to type into a Slack message box and send it, use ONE click_and_traverse call with element="Message to X", text="hello", pressKey="Return"`. That sentence is the only per-app example in the entire instructions field. Every MCP client (Claude Desktop, Cursor, VS Code, Cline) receives that text when it connects. The model reads it and chains click + type + Return into a single JSON-RPC call instead of three separate ones.

What does "Message to X" refer to?

That is the literal accessibility label Slack's desktop app attaches to the compose input in a channel or DM. Slack renders it as "Message to #general" for channels and "Message to Alice" for DMs. macos-use's element matcher finds elements whose AX text or AX description contains that substring, reads their x/y/w/h from the accessibility tree, and auto-centers the click at (x+w/2, y+h/2) per main.swift:1574. The model never needs to estimate coordinates from a screenshot, which is explicitly forbidden by the instructions at main.swift:1429.

What JSON-RPC arguments actually get sent for a Slack message?

Exactly one callTool request. The tool name is macos-use_click_and_traverse, and the arguments are { pid: <Slack pid>, element: "Message to general", text: "standup in 5", pressKey: "Return" }. The server gets the Slack PID once via NSWorkspace.shared.runningApplications (or a prior open_application_and_traverse call). The composed click → type → press is executed in order on the main run loop at main.swift:1709-1741, with a 100ms sleep between steps and an InputGuard throwIfCancelled check at every boundary. One round trip, three OS-level effects.

Does this need Slack's API, admin approval, or a bot user?

No. This is the point of the page. The macOS Accessibility framework lets any app with the Accessibility permission read and synthesize events against any other app on the machine. Slack's AXUIElement tree is exposed to the OS the same way Safari's or Mail's is. There is no HTTP, no OAuth, no scopes, no rate limits. If you can click in Slack on your laptop, macos-use can click in Slack on your laptop. The downside: this only works on the one Mac where the MCP client runs, and only while Slack is open and signed in there.

What stops the model from typing in the wrong Slack channel?

Three things. First, the element parameter is a substring match against the accessibility tree of the target PID, so the model has to name the channel: `element: "Message to #standups"` targets that channel, not whatever is open. Second, every disruptive tool call engages InputGuard.swift, which shows a full-screen overlay with a pulsing orange dot and a "press Esc to cancel" hint for the duration. Third, plain Esc (keycode 53, no modifiers) is hard-wired as a cancel key at InputGuard.swift:345, cancels anywhere on the OS, and writes /tmp/macos-use/esc_pressed.txt as a ground-truth marker. You can always verify the cancel landed.

What does the server return after sending a Slack message?

A compact text summary (built at main.swift:731) plus a pair of files on disk under /tmp/macos-use/. The .txt file is the flat accessibility tree of Slack after the Return was pressed; each line is `[AXRole] "text" x:N y:N w:W h:H visible`. The .png file is a screenshot of the Slack window, produced by a sibling binary called screenshot-helper so that ReplayKit's persistent ~19% CPU cost dies with the subprocess (Sources/MCPServer/main.swift:435-510). The client uses grep on the .txt to confirm the message now appears in the channel and optionally reads the .png to eyeball the result.
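For a client that wants more than a raw grep, the line format quoted above is simple enough to parse. The Python sketch below assumes each line looks exactly like `[AXRole] "text" x:N y:N w:W h:H visible`, with the trailing visible flag optional. The regular expression and both helper names are hypothetical, and real dump files may contain variations this does not cover.

```python
import re
from typing import Dict, Iterable, List, Optional

_NUM = r"-?\d+(?:\.\d+)?"
_LINE_RE = re.compile(
    r'^\[(?P<role>[^\]]+)\] "(?P<text>.*)" '
    rf"x:(?P<x>{_NUM}) y:(?P<y>{_NUM}) w:(?P<w>{_NUM}) h:(?P<h>{_NUM})"
    r"(?P<visible> visible)?\s*$"
)


def parse_line(line: str) -> Optional[Dict[str, object]]:
    """Parse one tree line into a dict, or return None if it does not match."""
    match = _LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "role": match.group("role"),
        "text": match.group("text"),
        "x": float(match.group("x")),
        "y": float(match.group("y")),
        "w": float(match.group("w")),
        "h": float(match.group("h")),
        "visible": match.group("visible") is not None,
    }


def find_text(lines: Iterable[str], phrase: str,
              role: str = "AXStaticText") -> List[Dict[str, object]]:
    """Return parsed lines of the given role whose text contains `phrase`."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p and p["role"] == role and phrase in str(p["text"])]


if __name__ == "__main__":
    sample = [
        '[AXTextArea] "Message to standups" x:412 y:1034 w:1180 h:40 visible',
        '[AXStaticText] "standup in 5 min, room 3A" x:420 y:980 w:600 h:18 visible',
    ]
    print(find_text(sample, "standup in 5 min"))
```

Filtering on the role avoids a false positive from the compose field itself, whose label can contain the channel name.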

How does it pick between two open Slack workspaces?

macOS runs each signed-in Slack workspace as a separate window inside one Slack.app process, not as separate processes. macos-use's window matcher at main.swift:393-425 gets every on-screen window owned by the Slack PID, filters to layer 0, and if a traversalWindowBounds was captured (typically from an open_application_and_traverse call that activated the intended workspace), scores each candidate by intersection overlap and picks the highest-scoring window. Practically, the way to disambiguate is to open the target workspace first with Cmd-Option-1/2/3, then call open_application_and_traverse, then send the message. Overlap scoring does the rest.
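Overlap scoring of this kind comes down to rectangle intersection. The Python sketch below picks the candidate window whose bounds share the largest area with a reference rectangle. It omits the layer-0 filter mentioned above, and the Rect type and function names are hypothetical rather than taken from main.swift:393-425.

```python
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    x: float
    y: float
    w: float
    h: float


def intersection_area(a: Rect, b: Rect) -> float:
    """Area shared by two axis-aligned rectangles (0.0 if they do not overlap)."""
    width = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    height = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(width, 0.0) * max(height, 0.0)


def pick_window(candidates: Sequence[Rect], reference: Rect) -> Optional[Rect]:
    """Return the candidate with the largest overlap, or None if there are none."""
    if not candidates:
        return None
    return max(candidates, key=lambda rect: intersection_area(rect, reference))


if __name__ == "__main__":
    captured = Rect(0, 0, 100, 100)
    windows = [Rect(200, 200, 50, 50), Rect(50, 50, 100, 100), Rect(0, 0, 60, 60)]
    print(pick_window(windows, captured))
```

Clamping each dimension to zero before multiplying prevents two negative extents from producing a spurious positive area for windows that do not overlap at all.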

Can it search old Slack messages the way the official MCP can?

Only what is visible in the Slack UI. macos-use does not call search.messages. To read past messages, the model scrolls the message list (macos-use_scroll_and_traverse), reads the resulting accessibility tree from /tmp/macos-use/<timestamp>_scroll.txt, and greps for the phrase. This is fast for recent history but becomes slower than the REST API for deep search. For a cross-workspace full-history index, use Slack's official MCP. For single-workspace interactive use where you do not have or want an API token, the desktop approach is more than enough.

Does it work on Slack for Enterprise Grid or GovSlack?

Yes. Both. macos-use sees Slack.app as a normal macOS accessibility tree. There is no HTTP path that an admin could block. The Enterprise Grid workspace picker, the per-org DM lists, the connected-workspace channels all render through the same AXUIElement hierarchy the OS exposes for any signed-in workspace. The official MCP at docs.slack.dev/ai/slack-mcp-server explicitly requires admin enablement per workspace; macos-use inherits whatever permissions the human user already has in Slack.

What happens if Slack is not running when the tool is called?

Call macos-use_open_application_and_traverse with bundleId com.tinyspeck.slackmacgap (or name "Slack") first. The server activates the app, waits 500ms for it to become frontmost, captures the initial accessibility tree and screenshot, and returns the PID you will use for the following click_and_traverse. If the app is already running, the open call just activates it and re-reads the tree; it is idempotent. After the open, the model has a PID and a baseline, and the next call can do the compose-and-send chain in one request.

What about multi-line messages with Shift-Return?

Use the `text` parameter for the body, then pressKey. The type path at main.swift posts CGEvent keystrokes with .hidSystemState so InputGuard lets them through (main.swift:1709-1733). Newlines inside `text` are typed as raw newline keystrokes, which Slack interprets as Shift-Enter because Slack's compose field is in rich-text mode and Return is the send key. If you want a truly multi-line draft, pass the body with embedded \n characters and pressKey="Return" at the end. The composed path waits 100ms between actions so Slack's React input has time to reconcile.

Is there anything that distinguishes this repo from a generic macOS AX wrapper?

Two things specifically. First, the Slack example literally lives in the server's MCP instructions string at main.swift:1424, so every MCP client that connects is told about Slack by name. Second, the combined click + type + pressKey contract at main.swift:1709-1741 means a Slack send is one JSON-RPC round trip, not three. A naive wrapper would expose separate click, type, and keypress tools and leave the orchestration to the model; macos-use teaches the model to chain them, then enforces the chain on the server side with per-step InputGuard checks and a shared pid_t traversal context.

Read The One Line That Makes This A Slack Server Instead Of A Generic AX Wrapper

main.swift:1424 is a single sentence inside the server's MCP instructions. It is what every AI client sees on initialize. Forking that line is how you teach the server about a second app.
