Adopt Android detached runtime recovery APIs

Use the updated framework detached-runtime and typed detached-control APIs through a reflection compat layer, surface the target runtime in the Agent UI, and switch the Genie detached flow to framework-owned recovery with smaller detached-frame payloads. The goal is for HOME sessions to survive detached launch and frame capture on the new framework surface.

Co-authored-by: Codex <noreply@openai.com>
Iliyan Malchev
2026-03-23 18:35:02 -07:00
parent 7ad5e4eff3
commit 9b0d7a5271
12 changed files with 640 additions and 35 deletions

View File

@@ -298,6 +298,7 @@ class AgentFrameworkToolBridge(
.put("state", session.stateLabel)
.put("targetDetached", session.targetDetached)
.put("targetPresentation", session.targetPresentationLabel)
.put("targetRuntime", session.targetRuntimeLabel)
.put(
"requiredFinalPresentation",
session.requiredFinalPresentationPolicy?.wireValue,

View File

@@ -7,6 +7,7 @@ import android.content.Context
import android.os.Binder
import android.os.Process
import android.util.Log
import com.openai.codex.bridge.DetachedTargetCompat
import com.openai.codex.bridge.FrameworkSessionTransportCompat
import com.openai.codex.bridge.SessionExecutionSettings
import java.util.concurrent.Executor
@@ -76,6 +77,7 @@ class AgentSessionController(context: Context) {
presentationPolicyStore.prunePolicies(sessions.map { it.sessionId }.toSet())
executionSettingsStore.pruneSettings(sessions.map { it.sessionId }.toSet())
var sessionDetails = sessions.map { session ->
val targetRuntime = DetachedTargetCompat.getTargetRuntime(session)
AgentSessionDetails(
sessionId = session.sessionId,
parentSessionId = session.parentSessionId,
@@ -85,6 +87,8 @@ class AgentSessionController(context: Context) {
stateLabel = stateToString(session.state),
targetPresentation = session.targetPresentation,
targetPresentationLabel = targetPresentationToString(session.targetPresentation),
targetRuntime = targetRuntime.value,
targetRuntimeLabel = targetRuntime.label,
targetDetached = session.isTargetDetached,
requiredFinalPresentationPolicy = presentationPolicyStore.getPolicy(session.sessionId),
latestQuestion = null,
@@ -665,6 +669,8 @@ data class AgentSessionDetails(
val stateLabel: String,
val targetPresentation: Int,
val targetPresentationLabel: String,
val targetRuntime: Int?,
val targetRuntimeLabel: String,
val targetDetached: Boolean,
val requiredFinalPresentationPolicy: SessionFinalPresentationPolicy?,
val latestQuestion: String?,

View File

@@ -21,6 +21,7 @@ object SessionContinuationPromptBuilder {
selectedSession.targetPackage?.let { appendLine("- Target package: $it") }
appendLine("- Previous state: ${selectedSession.stateLabel}")
appendLine("- Previous presentation: ${selectedSession.targetPresentationLabel}")
appendLine("- Previous runtime: ${selectedSession.targetRuntimeLabel}")
selectedSession.latestResult
?.takeIf(String::isNotBlank)
?.let { appendLine("- Previous result: ${it.take(MAX_DETAIL_CHARS)}") }

View File

@@ -53,6 +53,7 @@ object SessionUiFormatter {
session.targetPackage?.let { append(" ($it)") }
append("\nState: ${session.stateLabel}\n")
append("Target presentation: ${session.targetPresentationLabel}\n")
append("Target runtime: ${session.targetRuntimeLabel}\n")
session.requiredFinalPresentationPolicy?.let { policy ->
append("Required final presentation: ${policy.wireValue}\n")
}

View File

@@ -15,7 +15,7 @@ This Codex runtime is operating on an Android device through the Agent Platform.
- You are paired with exactly one target app sandbox for this session.
- Solve the delegated objective inside that sandbox by using the normal Codex tool path and the Android tools that are available on-device.
- Ask the Agent a concise free-form question only when you are blocked on missing intent, missing constraints, or a framework-owned action.
- Do not assume you can reach the internet directly. Model and auth traffic are Agent-owned.
- Do not assume you can reach the internet directly. Live session model traffic is framework-owned, and auth material originates from the Agent.
- Do not rely on direct cross-app `bindService(...)` or raw local sockets to reach the Agent. Use the framework-managed session bridge.
## Shell and device tooling
@@ -37,6 +37,7 @@ This Codex runtime is operating on an Android device through the Agent Platform.
- Detached launch, shown-detached, and attached are different states.
- `targetDetached=true` means the target is still detached even if it is visible in a detached or mirrored presentation.
- If the framework launched the target detached for you, treat that launch as authoritative. Do not relaunch the target package with plain shell launchers such as `am start`, `cmd activity start-activity`, or `monkey -p`; use framework target controls plus UI inspection/input instead.
- If the detached target disappears or the framework reports a missing detached target, use the framework recovery primitive first (`android_target_ensure_hidden`) instead of ordinary app launch.
- If the delegated objective specifies a required final target presentation such as `ATTACHED`, `DETACHED_HIDDEN`, or `DETACHED_SHOWN`, treat that as a hard completion requirement and do not claim success until the framework session matches it.
- If the task says the app should be visible to the user, do not claim success until the target is attached unless the task explicitly allows detached presentation.
- If the user asks to show an activity on the screen, the Genie must explicitly make its display visible. Launching hidden or leaving the target detached is not enough.

View File

@@ -0,0 +1,473 @@
package com.openai.codex.bridge
import android.app.agent.AgentSessionInfo
import android.app.agent.GenieService
import android.window.ScreenCapture
import java.lang.reflect.Field
import java.lang.reflect.InvocationTargetException
import java.lang.reflect.Method
import java.lang.reflect.Modifier
object DetachedTargetCompat {
private const val METHOD_GET_TARGET_RUNTIME = "getTargetRuntime"
private const val METHOD_ENSURE_DETACHED_TARGET_HIDDEN = "ensureDetachedTargetHidden"
private const val METHOD_SHOW_DETACHED_TARGET = "showDetachedTarget"
private const val METHOD_HIDE_DETACHED_TARGET = "hideDetachedTarget"
private const val METHOD_ATTACH_DETACHED_TARGET = "attachDetachedTarget"
private const val METHOD_CLOSE_DETACHED_TARGET = "closeDetachedTarget"
private const val METHOD_CAPTURE_DETACHED_TARGET_FRAME_RESULT = "captureDetachedTargetFrameResult"
private const val METHOD_GET_STATUS = "getStatus"
private const val METHOD_GET_DETACHED_DISPLAY_ID = "getDetachedDisplayId"
private const val METHOD_GET_MESSAGE = "getMessage"
private const val TARGET_RUNTIME_NONE_LABEL = "TARGET_RUNTIME_NONE"
private const val TARGET_RUNTIME_ATTACHED_LABEL = "TARGET_RUNTIME_ATTACHED"
private const val TARGET_RUNTIME_DETACHED_LAUNCHING_LABEL = "TARGET_RUNTIME_DETACHED_LAUNCHING"
private const val TARGET_RUNTIME_DETACHED_HIDDEN_LABEL = "TARGET_RUNTIME_DETACHED_HIDDEN"
private const val TARGET_RUNTIME_DETACHED_SHOWN_LABEL = "TARGET_RUNTIME_DETACHED_SHOWN"
private const val TARGET_RUNTIME_MISSING_LABEL = "TARGET_RUNTIME_MISSING"
private const val STATUS_OK_LABEL = "STATUS_OK"
private const val STATUS_NO_DETACHED_DISPLAY_LABEL = "STATUS_NO_DETACHED_DISPLAY"
private const val STATUS_NO_TARGET_TASK_LABEL = "STATUS_NO_TARGET_TASK"
private const val STATUS_LAUNCH_FAILED_LABEL = "STATUS_LAUNCH_FAILED"
private const val STATUS_INTERNAL_ERROR_LABEL = "STATUS_INTERNAL_ERROR"
private const val STATUS_CAPTURE_FAILED_LABEL = "STATUS_CAPTURE_FAILED"
data class DetachedTargetState(
val value: Int?,
val label: String,
) {
fun isMissing(): Boolean = label == TARGET_RUNTIME_MISSING_LABEL
}
data class DetachedTargetControlResult(
val status: Int?,
val statusLabel: String,
val targetRuntime: DetachedTargetState,
val detachedDisplayId: Int?,
val message: String?,
) {
fun isOk(): Boolean = statusLabel == STATUS_OK_LABEL
fun needsRecovery(): Boolean {
return statusLabel == STATUS_NO_DETACHED_DISPLAY_LABEL ||
statusLabel == STATUS_NO_TARGET_TASK_LABEL ||
targetRuntime.isMissing()
}
fun summary(action: String): String {
return buildString {
append("Detached target ")
append(action)
append(" -> ")
append(statusLabel)
append(" (runtime=")
append(targetRuntime.label)
detachedDisplayId?.let { displayId ->
append(", display=")
append(displayId)
}
append(")")
message?.takeIf(String::isNotBlank)?.let { detail ->
append(": ")
append(detail)
}
}
}
}
data class DetachedTargetCaptureResult(
val status: Int?,
val statusLabel: String,
val targetRuntime: DetachedTargetState,
val detachedDisplayId: Int?,
val message: String?,
val captureResult: ScreenCapture.ScreenCaptureResult?,
) {
fun isOk(): Boolean = statusLabel == STATUS_OK_LABEL && captureResult != null
fun needsRecovery(): Boolean {
return statusLabel == STATUS_NO_DETACHED_DISPLAY_LABEL ||
statusLabel == STATUS_NO_TARGET_TASK_LABEL ||
targetRuntime.isMissing()
}
fun summary(): String {
return buildString {
append("Detached target capture -> ")
append(statusLabel)
append(" (runtime=")
append(targetRuntime.label)
detachedDisplayId?.let { displayId ->
append(", display=")
append(displayId)
}
append(")")
message?.takeIf(String::isNotBlank)?.let { detail ->
append(": ")
append(detail)
}
}
}
}
private val targetRuntimeLabels: Map<Int, String> by lazy(LazyThreadSafetyMode.SYNCHRONIZED) {
staticIntFields(AgentSessionInfo::class.java, "TARGET_RUNTIME_")
}
private val getTargetRuntimeMethod: Method? by lazy(LazyThreadSafetyMode.SYNCHRONIZED) {
findOptionalMethod(AgentSessionInfo::class.java, METHOD_GET_TARGET_RUNTIME)
}
fun getTargetRuntime(sessionInfo: AgentSessionInfo): DetachedTargetState {
val runtimeValue = getTargetRuntimeMethod?.let { method ->
invokeChecked { method.invoke(sessionInfo) as? Int }
}
if (runtimeValue != null) {
return DetachedTargetState(
value = runtimeValue,
label = targetRuntimeLabels[runtimeValue] ?: runtimeValue.toString(),
)
}
return when {
sessionInfo.targetPresentation == AgentSessionInfo.TARGET_PRESENTATION_DETACHED_HIDDEN -> {
DetachedTargetState(
value = null,
label = TARGET_RUNTIME_DETACHED_HIDDEN_LABEL,
)
}
sessionInfo.targetPresentation == AgentSessionInfo.TARGET_PRESENTATION_DETACHED_SHOWN -> {
DetachedTargetState(
value = null,
label = TARGET_RUNTIME_DETACHED_SHOWN_LABEL,
)
}
sessionInfo.isTargetDetached -> {
DetachedTargetState(
value = null,
label = TARGET_RUNTIME_DETACHED_LAUNCHING_LABEL,
)
}
sessionInfo.targetPackage != null -> {
DetachedTargetState(
value = null,
label = TARGET_RUNTIME_ATTACHED_LABEL,
)
}
else -> DetachedTargetState(
value = null,
label = TARGET_RUNTIME_NONE_LABEL,
)
}
}
fun ensureDetachedTargetHidden(
callback: GenieService.Callback,
sessionId: String,
): DetachedTargetControlResult {
return invokeControl(
callback = callback,
sessionId = sessionId,
methodName = METHOD_ENSURE_DETACHED_TARGET_HIDDEN,
legacyFallback = {
callback.requestLaunchDetachedTargetHidden(sessionId)
DetachedTargetControlResult(
status = null,
statusLabel = STATUS_OK_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = TARGET_RUNTIME_DETACHED_HIDDEN_LABEL,
),
detachedDisplayId = null,
message = "Used legacy detached launch callback.",
)
},
)
}
fun showDetachedTarget(
callback: GenieService.Callback,
sessionId: String,
): DetachedTargetControlResult {
return invokeControl(
callback = callback,
sessionId = sessionId,
methodName = METHOD_SHOW_DETACHED_TARGET,
legacyFallback = {
callback.requestShowDetachedTarget(sessionId)
DetachedTargetControlResult(
status = null,
statusLabel = STATUS_OK_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = TARGET_RUNTIME_DETACHED_SHOWN_LABEL,
),
detachedDisplayId = null,
message = "Used legacy detached show callback.",
)
},
)
}
fun hideDetachedTarget(
callback: GenieService.Callback,
sessionId: String,
): DetachedTargetControlResult {
return invokeControl(
callback = callback,
sessionId = sessionId,
methodName = METHOD_HIDE_DETACHED_TARGET,
legacyFallback = {
callback.requestHideDetachedTarget(sessionId)
DetachedTargetControlResult(
status = null,
statusLabel = STATUS_OK_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = TARGET_RUNTIME_DETACHED_HIDDEN_LABEL,
),
detachedDisplayId = null,
message = "Used legacy detached hide callback.",
)
},
)
}
fun attachDetachedTarget(
callback: GenieService.Callback,
sessionId: String,
): DetachedTargetControlResult {
return invokeControl(
callback = callback,
sessionId = sessionId,
methodName = METHOD_ATTACH_DETACHED_TARGET,
legacyFallback = {
callback.requestAttachTarget(sessionId)
DetachedTargetControlResult(
status = null,
statusLabel = STATUS_OK_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = TARGET_RUNTIME_ATTACHED_LABEL,
),
detachedDisplayId = null,
message = "Used legacy target attach callback.",
)
},
)
}
fun closeDetachedTarget(
callback: GenieService.Callback,
sessionId: String,
): DetachedTargetControlResult {
return invokeControl(
callback = callback,
sessionId = sessionId,
methodName = METHOD_CLOSE_DETACHED_TARGET,
legacyFallback = {
callback.requestCloseDetachedTarget(sessionId)
DetachedTargetControlResult(
status = null,
statusLabel = STATUS_OK_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = TARGET_RUNTIME_NONE_LABEL,
),
detachedDisplayId = null,
message = "Used legacy detached close callback.",
)
},
)
}
fun captureDetachedTargetFrameResult(
callback: GenieService.Callback,
sessionId: String,
): DetachedTargetCaptureResult {
val method = findOptionalMethod(
callback.javaClass,
METHOD_CAPTURE_DETACHED_TARGET_FRAME_RESULT,
String::class.java,
)
if (method == null) {
val captureResult = callback.captureDetachedTargetFrame(sessionId)
return DetachedTargetCaptureResult(
status = null,
statusLabel = if (captureResult != null) STATUS_OK_LABEL else STATUS_CAPTURE_FAILED_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = if (captureResult != null) {
TARGET_RUNTIME_DETACHED_HIDDEN_LABEL
} else {
TARGET_RUNTIME_NONE_LABEL
},
),
detachedDisplayId = null,
message = if (captureResult != null) {
"Used legacy detached-frame capture callback."
} else {
"Legacy detached-frame capture returned null."
},
captureResult = captureResult,
)
}
val resultObject = invokeChecked {
method.invoke(callback, sessionId)
} ?: return DetachedTargetCaptureResult(
status = null,
statusLabel = STATUS_CAPTURE_FAILED_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = TARGET_RUNTIME_NONE_LABEL,
),
detachedDisplayId = null,
message = "Detached target capture returned null result object.",
captureResult = null,
)
return parseCaptureResult(resultObject)
}
private fun invokeControl(
callback: GenieService.Callback,
sessionId: String,
methodName: String,
legacyFallback: () -> DetachedTargetControlResult,
): DetachedTargetControlResult {
val method = findOptionalMethod(callback.javaClass, methodName, String::class.java)
if (method == null) {
return legacyFallback()
}
val resultObject = invokeChecked {
method.invoke(callback, sessionId)
} ?: return DetachedTargetControlResult(
status = null,
statusLabel = STATUS_INTERNAL_ERROR_LABEL,
targetRuntime = DetachedTargetState(
value = null,
label = TARGET_RUNTIME_NONE_LABEL,
),
detachedDisplayId = null,
message = "$methodName returned null result object.",
)
return parseControlResult(resultObject)
}
private fun parseControlResult(resultObject: Any): DetachedTargetControlResult {
val resultClass = resultObject.javaClass
val status = invokeChecked {
findRequiredMethod(resultClass, METHOD_GET_STATUS).invoke(resultObject) as? Int
}
return DetachedTargetControlResult(
status = status,
statusLabel = statusLabel(resultClass, status),
targetRuntime = parseTargetRuntime(resultObject),
detachedDisplayId = optionalInt(resultObject, METHOD_GET_DETACHED_DISPLAY_ID),
message = optionalString(resultObject, METHOD_GET_MESSAGE),
)
}
private fun parseCaptureResult(resultObject: Any): DetachedTargetCaptureResult {
val resultClass = resultObject.javaClass
val status = invokeChecked {
findRequiredMethod(resultClass, METHOD_GET_STATUS).invoke(resultObject) as? Int
}
val captureGetter = findOptionalMethod(resultClass, "getCaptureResult")
?: findOptionalMethod(resultClass, "getScreenCaptureResult")
val captureResult = captureGetter?.let { method ->
invokeChecked { method.invoke(resultObject) as? ScreenCapture.ScreenCaptureResult }
}
return DetachedTargetCaptureResult(
status = status,
statusLabel = statusLabel(resultClass, status),
targetRuntime = parseTargetRuntime(resultObject),
detachedDisplayId = optionalInt(resultObject, METHOD_GET_DETACHED_DISPLAY_ID),
message = optionalString(resultObject, METHOD_GET_MESSAGE),
captureResult = captureResult,
)
}
private fun parseTargetRuntime(resultObject: Any): DetachedTargetState {
val runtime = optionalInt(resultObject, METHOD_GET_TARGET_RUNTIME)
return if (runtime != null) {
DetachedTargetState(
value = runtime,
label = targetRuntimeLabels[runtime] ?: runtime.toString(),
)
} else {
DetachedTargetState(
value = null,
label = TARGET_RUNTIME_NONE_LABEL,
)
}
}
private fun statusLabel(
resultClass: Class<*>,
status: Int?,
): String {
if (status == null) {
return STATUS_INTERNAL_ERROR_LABEL
}
return staticIntFields(resultClass, "STATUS_")[status] ?: status.toString()
}
private fun optionalInt(
target: Any,
methodName: String,
): Int? {
val method = findOptionalMethod(target.javaClass, methodName) ?: return null
return invokeChecked { method.invoke(target) as? Int }
}
private fun optionalString(
target: Any,
methodName: String,
): String? {
val method = findOptionalMethod(target.javaClass, methodName) ?: return null
return invokeChecked { method.invoke(target) as? String }?.ifBlank { null }
}
private fun staticIntFields(
clazz: Class<*>,
prefix: String,
): Map<Int, String> {
return clazz.fields
.filter(::isStaticIntField)
.filter { field -> field.name.startsWith(prefix) }
.associate { field ->
field.getInt(null) to field.name
}
}
private fun isStaticIntField(field: Field): Boolean {
return Modifier.isStatic(field.modifiers) && field.type == Int::class.javaPrimitiveType
}
private fun findRequiredMethod(
clazz: Class<*>,
name: String,
vararg parameterTypes: Class<*>,
): Method {
return clazz.getMethod(name, *parameterTypes)
}
private fun findOptionalMethod(
clazz: Class<*>,
name: String,
vararg parameterTypes: Class<*>,
): Method? {
return runCatching {
clazz.getMethod(name, *parameterTypes)
}.getOrNull()
}
private fun <T> invokeChecked(block: () -> T): T {
try {
return block()
} catch (err: InvocationTargetException) {
throw err.targetException ?: err
}
}
}
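
The compat object above prefers a newer typed method when the framework exposes it and otherwise falls back to the legacy callback. A minimal, framework-free sketch of that pattern follows. `LegacyCallback`, `TypedCallback`, `requestShow`, and `showTyped` are hypothetical stand-ins, not the real `GenieService.Callback` surface; only the lookup-then-fallback shape is taken from the diff.

```kotlin
import java.lang.reflect.InvocationTargetException
import java.lang.reflect.Method

// Hypothetical older callback surface: fire-and-forget, no typed result.
open class LegacyCallback {
    fun requestShow(sessionId: String) { /* no result to report */ }
}

// Hypothetical newer surface that also returns a status string.
class TypedCallback : LegacyCallback() {
    fun showTyped(sessionId: String): String = "STATUS_OK:$sessionId"
}

// Look up a public method by name; return null when the class does not have it.
fun findOptionalMethod(clazz: Class<*>, name: String, vararg params: Class<*>): Method? =
    runCatching { clazz.getMethod(name, *params) }.getOrNull()

// Prefer the typed method when present; otherwise call the legacy one
// and synthesize a result, as the legacyFallback lambdas in the diff do.
fun show(callback: LegacyCallback, sessionId: String): String {
    val typed = findOptionalMethod(callback.javaClass, "showTyped", String::class.java)
    if (typed == null) {
        callback.requestShow(sessionId)
        return "STATUS_OK (legacy)"
    }
    return try {
        typed.invoke(callback, sessionId) as String
    } catch (err: InvocationTargetException) {
        // Unwrap so callers see the exception the target method actually threw.
        throw err.targetException ?: err
    }
}
```

Calling `show(LegacyCallback(), "s1")` takes the fallback path, while `show(TypedCallback(), "s1")` goes through reflection. Resolving the method per call keeps the sketch simple; caching the `Method` would be a reasonable refinement when the callback class is stable.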

View File

@@ -3,8 +3,11 @@ package com.openai.codex.genie
import android.app.agent.GenieService
import android.graphics.Bitmap
import android.util.Base64
import com.openai.codex.bridge.DetachedTargetCompat
import java.io.ByteArrayOutputStream
import java.io.IOException
import kotlin.math.max
import kotlin.math.roundToInt
import org.json.JSONObject
class AndroidGenieToolExecutor(
@@ -12,6 +15,12 @@ class AndroidGenieToolExecutor(
private val sessionId: String,
) {
companion object {
private const val MAX_CAPTURE_LONG_EDGE = 480
private const val MAX_CAPTURE_JPEG_BYTES = 48 * 1024
private const val INITIAL_JPEG_QUALITY = 65
private const val MIN_CAPTURE_JPEG_QUALITY = 38
const val ENSURE_HIDDEN_TARGET_TOOL = "android_target_ensure_hidden"
const val SHOW_TARGET_TOOL = "android_target_show"
const val HIDE_TARGET_TOOL = "android_target_hide"
const val ATTACH_TARGET_TOOL = "android_target_attach"
@@ -24,21 +33,37 @@ class AndroidGenieToolExecutor(
@Suppress("UNUSED_PARAMETER") arguments: JSONObject,
): GenieToolObservation {
return when (toolName) {
ENSURE_HIDDEN_TARGET_TOOL -> requestTargetVisibility(
action = "ensure hidden",
request = {
DetachedTargetCompat.ensureDetachedTargetHidden(callback, sessionId)
},
attemptRecovery = false,
)
SHOW_TARGET_TOOL -> requestTargetVisibility(
action = "show",
request = callback::requestShowDetachedTarget,
request = {
DetachedTargetCompat.showDetachedTarget(callback, sessionId)
},
)
HIDE_TARGET_TOOL -> requestTargetVisibility(
action = "hide",
request = callback::requestHideDetachedTarget,
request = {
DetachedTargetCompat.hideDetachedTarget(callback, sessionId)
},
)
ATTACH_TARGET_TOOL -> requestTargetVisibility(
action = "attach",
request = callback::requestAttachTarget,
request = {
DetachedTargetCompat.attachDetachedTarget(callback, sessionId)
},
)
CLOSE_TARGET_TOOL -> requestTargetVisibility(
action = "close",
request = callback::requestCloseDetachedTarget,
request = {
DetachedTargetCompat.closeDetachedTarget(callback, sessionId)
},
attemptRecovery = false,
)
CAPTURE_TARGET_FRAME_TOOL -> captureDetachedTargetFrame()
else -> throw IOException("Unknown tool: $toolName")
@@ -47,37 +72,126 @@ class AndroidGenieToolExecutor(
private fun requestTargetVisibility(
action: String,
request: (String) -> Unit,
request: () -> DetachedTargetCompat.DetachedTargetControlResult,
attemptRecovery: Boolean = true,
): GenieToolObservation {
request(sessionId)
val recoveryDetails = mutableListOf<String>()
var result = request()
if (attemptRecovery && result.needsRecovery()) {
val recovery = DetachedTargetCompat.ensureDetachedTargetHidden(callback, sessionId)
recoveryDetails += recovery.summary("ensure hidden")
if (recovery.isOk()) {
result = request()
} else {
throw IOException(
"${result.summary(action)} Recovery failed: ${recovery.summary("ensure hidden")}",
)
}
}
if (!result.isOk()) {
throw IOException(result.summary(action))
}
val promptDetails = buildString {
append(result.summary(action))
recoveryDetails.forEach { detail ->
append("\n")
append(detail)
}
}
return GenieToolObservation(
name = "android_target_$action",
summary = "Requested detached target $action.",
promptDetails = "Requested framework action android_target_$action for session $sessionId.",
name = "android_target_" + action.replace(' ', '_'),
summary = promptDetails.lineSequence().first(),
promptDetails = promptDetails,
)
}
private fun captureDetachedTargetFrame(): GenieToolObservation {
val result = callback.captureDetachedTargetFrame(sessionId)
?: throw IOException("captureDetachedTargetFrame returned null")
val recoveryDetails = mutableListOf<String>()
var capture = DetachedTargetCompat.captureDetachedTargetFrameResult(callback, sessionId)
if (capture.needsRecovery()) {
val recovery = DetachedTargetCompat.ensureDetachedTargetHidden(callback, sessionId)
recoveryDetails += recovery.summary("ensure hidden")
if (recovery.isOk()) {
capture = DetachedTargetCompat.captureDetachedTargetFrameResult(callback, sessionId)
} else {
throw IOException("${capture.summary()} Recovery failed: ${recovery.summary("ensure hidden")}")
}
}
if (!capture.isOk()) {
throw IOException(capture.summary())
}
val result = checkNotNull(capture.captureResult)
val hardwareBuffer = result.hardwareBuffer ?: throw IOException("Detached frame missing hardware buffer")
val bitmap = Bitmap.wrapHardwareBuffer(hardwareBuffer, result.colorSpace)
?: throw IOException("Failed to wrap detached frame")
val copy = bitmap.copy(Bitmap.Config.ARGB_8888, false)
?: throw IOException("Failed to copy detached frame")
val jpeg = ByteArrayOutputStream().use { output ->
if (!copy.compress(Bitmap.CompressFormat.JPEG, 85, output)) {
throw IOException("Failed to encode detached frame")
}
output.toByteArray()
}
val (encodedBitmap, jpeg) = encodeDetachedFrame(copy)
return GenieToolObservation(
name = CAPTURE_TARGET_FRAME_TOOL,
summary = "Captured detached target frame ${copy.width}x${copy.height}.",
promptDetails = "Captured detached target frame ${copy.width}x${copy.height}. Use the attached image to inspect the current UI.",
summary = "Captured detached target frame ${encodedBitmap.width}x${encodedBitmap.height} (${capture.targetRuntime.label}).",
promptDetails = buildString {
append(
"Captured detached target frame ${encodedBitmap.width}x${encodedBitmap.height}. Runtime=${capture.targetRuntime.label}. JPEG=${jpeg.size} bytes.",
)
recoveryDetails.forEach { detail ->
append("\n")
append(detail)
}
append("\nUse the attached image to inspect the current UI.")
},
imageDataUrls = listOf(
"data:image/jpeg;base64," + Base64.encodeToString(jpeg, Base64.NO_WRAP),
),
)
}
private fun encodeDetachedFrame(bitmap: Bitmap): Pair<Bitmap, ByteArray> {
var encodedBitmap = bitmap.downscaleIfNeeded(MAX_CAPTURE_LONG_EDGE)
var quality = INITIAL_JPEG_QUALITY
var jpeg = encodedBitmap.encodeJpeg(quality)
while (jpeg.size > MAX_CAPTURE_JPEG_BYTES && quality > MIN_CAPTURE_JPEG_QUALITY) {
quality -= 7
jpeg = encodedBitmap.encodeJpeg(quality)
}
while (jpeg.size > MAX_CAPTURE_JPEG_BYTES) {
val nextWidth = max((encodedBitmap.width * 0.8f).roundToInt(), 1)
val nextHeight = max((encodedBitmap.height * 0.8f).roundToInt(), 1)
if (nextWidth == encodedBitmap.width && nextHeight == encodedBitmap.height) {
break
}
val scaled = Bitmap.createScaledBitmap(encodedBitmap, nextWidth, nextHeight, true)
if (encodedBitmap !== bitmap) {
encodedBitmap.recycle()
}
encodedBitmap = scaled
quality = INITIAL_JPEG_QUALITY
jpeg = encodedBitmap.encodeJpeg(quality)
while (jpeg.size > MAX_CAPTURE_JPEG_BYTES && quality > MIN_CAPTURE_JPEG_QUALITY) {
quality -= 7
jpeg = encodedBitmap.encodeJpeg(quality)
}
}
return encodedBitmap to jpeg
}
private fun Bitmap.downscaleIfNeeded(maxLongEdge: Int): Bitmap {
val longEdge = max(width, height)
if (longEdge <= maxLongEdge) {
return this
}
val scale = maxLongEdge.toFloat() / longEdge.toFloat()
val scaledWidth = max((width * scale).roundToInt(), 1)
val scaledHeight = max((height * scale).roundToInt(), 1)
return Bitmap.createScaledBitmap(this, scaledWidth, scaledHeight, true)
}
private fun Bitmap.encodeJpeg(quality: Int): ByteArray {
return ByteArrayOutputStream().use { output ->
if (!compress(Bitmap.CompressFormat.JPEG, quality, output)) {
throw IOException("Failed to encode detached frame")
}
output.toByteArray()
}
}
}
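
The frame encoder above first lowers JPEG quality at the current size and only then shrinks the bitmap by 20% and starts over. The sketch below models that search without Android types, assuming a hypothetical `estimate` function that returns the encoded byte count for a given long edge and quality. It tracks only the long edge, whereas the diff scales width and height separately, so it illustrates the control flow rather than replacing `encodeDetachedFrame`.

```kotlin
import kotlin.math.roundToInt

// Outcome of the search: the chosen long edge, quality, and resulting size.
data class EncodePlan(val longEdge: Int, val quality: Int, val bytes: Int)

fun planEncoding(
    startLongEdge: Int,
    maxBytes: Int,
    initialQuality: Int,
    minQuality: Int,
    qualityStep: Int,
    estimate: (longEdge: Int, quality: Int) -> Int, // hypothetical encoder stand-in
): EncodePlan {
    var longEdge = startLongEdge
    while (true) {
        var quality = initialQuality
        var bytes = estimate(longEdge, quality)
        // Step 1: lower quality at the current size until it fits or the floor is reached.
        while (bytes > maxBytes && quality > minQuality) {
            quality -= qualityStep
            bytes = estimate(longEdge, quality)
        }
        // Step 2: if it still does not fit, shrink by 20% and restart at initial quality.
        val nextEdge = maxOf((longEdge * 0.8).roundToInt(), 1)
        if (bytes <= maxBytes || nextEdge == longEdge) {
            // Either within budget, or the size can no longer shrink.
            return EncodePlan(longEdge, quality, bytes)
        }
        longEdge = nextEdge
    }
}
```

As in the diff, the last quality step may land slightly below `minQuality`, because the floor is checked before each decrement. The result can also exceed `maxBytes` if the size cannot shrink any further.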

View File

@@ -589,6 +589,7 @@ class CodexAppServerHost(
If a direct command or intent clearly accomplishes the objective, stop and report success instead of continuing exploratory UI actions.
The Genie may request detached target launch through the framework callback, and after that it should treat the target as already launched by the framework.
Use detached-target tools to show or inspect the target, then continue with supported shell input and inspection surfaces rather than relaunching the target package.
If detached recovery is needed because the target disappeared, use android_target_ensure_hidden before retrying UI inspection.
Use Android dynamic tools only for framework-only detached target operations that do not have a working shell equivalent in the paired app sandbox.
$detachedSessionInstructions
The delegated objective may include a required final target presentation such as ATTACHED, DETACHED_HIDDEN, or DETACHED_SHOWN. Treat that as a hard completion requirement and do not report success until the framework session actually matches it.
@@ -607,6 +608,7 @@ class CodexAppServerHost(
Detached-session requirement:
- The framework already launched ${request.targetPackage} hidden for this session.
- Do not relaunch ${request.targetPackage} with shell launch commands. Use framework target controls plus UI inspection and input instead.
- If the detached target disappears or looks empty, use android_target_ensure_hidden to request framework-owned recovery.
""".trimIndent()
} else {
""
@@ -623,6 +625,7 @@ class CodexAppServerHost(
private fun buildDynamicToolSpecs(): JSONArray {
return JSONArray()
.put(dynamicToolSpec(AndroidGenieToolExecutor.ENSURE_HIDDEN_TARGET_TOOL, "Ensure the detached target exists and remains hidden. Use this to restore a missing detached target.", emptyObjectSchema()))
.put(dynamicToolSpec(AndroidGenieToolExecutor.SHOW_TARGET_TOOL, "Show the detached target window.", emptyObjectSchema()))
.put(dynamicToolSpec(AndroidGenieToolExecutor.HIDE_TARGET_TOOL, "Hide the detached target window.", emptyObjectSchema()))
.put(dynamicToolSpec(AndroidGenieToolExecutor.ATTACH_TARGET_TOOL, "Reattach the detached target back to the main display.", emptyObjectSchema()))

View File

@@ -4,6 +4,7 @@ import android.app.agent.AgentSessionInfo
import android.app.agent.GenieRequest
import android.app.agent.GenieService
import android.util.Log
import com.openai.codex.bridge.DetachedTargetCompat
import java.io.IOException
import java.util.concurrent.ConcurrentHashMap
@@ -53,11 +54,14 @@ class CodexGenieService : GenieService() {
)
if (request.isDetachedModeAllowed) {
callback.requestLaunchDetachedTargetHidden(sessionId)
callback.publishTrace(sessionId, "Requested detached target launch for ${request.targetPackage}.")
val detachedLaunch = DetachedTargetCompat.ensureDetachedTargetHidden(callback, sessionId)
callback.publishTrace(sessionId, detachedLaunch.summary("ensure hidden"))
check(detachedLaunch.isOk()) {
"Failed to prepare detached target for ${request.targetPackage}: ${detachedLaunch.summary("ensure hidden")}"
}
callback.publishTrace(
sessionId,
"Detached-session contract active for ${request.targetPackage}: the framework already launched the target hidden. Codex must use framework target controls plus UI inspection/input, not plain shell relaunches of the target package.",
"Detached-session contract active for ${request.targetPackage}: the framework owns detached launch and recovery. Codex must use framework target controls plus UI inspection/input, not plain shell relaunches of the target package.",
)
}

View File

@@ -9,8 +9,9 @@ internal object DetachedSessionGuard {
- The framework already launched $targetPackage hidden before your turn started.
- Do not relaunch $targetPackage with `am start`, `cmd activity start-activity`, `monkey -p`, or similar shell launch surfaces. That bypasses detached hosting and can be blocked by Android background-activity-launch policy.
- To surface the running target, use `android_target_show`.
- If the detached target disappears or the framework reports it missing, use `android_target_ensure_hidden` to request framework-owned recovery.
- To inspect the running detached target, use `android_target_capture_frame` and UI-inspection commands such as `uiautomator dump`.
- If the detached target appears missing or unresponsive, do not relaunch it with shell. Use framework target controls first; if they still do not expose a usable target, report the framework-state problem instead of guessing.
- Do not infer missing-target state from a blank launcher badge or a null frame alone. Use framework target controls first; if they still do not expose a usable target, report the framework-state problem instead of guessing.
""".trimIndent()
}
@@ -37,6 +38,6 @@ internal object DetachedSessionGuard {
targetPackage: String,
command: String,
): String {
return "Detached session contract violated: attempted to relaunch $targetPackage with shell command `$command`. The framework already launched the target hidden; use android_target_show/android_target_capture_frame plus UI inspection/input instead."
return "Detached session contract violated: attempted to relaunch $targetPackage with shell command `$command`. The framework already launched the target hidden; use android_target_ensure_hidden/android_target_show/android_target_capture_frame plus UI inspection/input instead."
}
}

View File

@@ -11,6 +11,7 @@ class DetachedSessionGuardTest {
assertTrue(instructions.contains("com.aurora.store"))
assertTrue(instructions.contains("Do not relaunch"))
assertTrue(instructions.contains("android_target_ensure_hidden"))
assertTrue(instructions.contains("android_target_show"))
}

View File

@@ -78,6 +78,10 @@ The current repo now contains these implementation slices:
launched the target hidden, Codex must not relaunch that same target package
with plain shell launchers. That bypasses detached hosting and can be blocked
by Android background-activity-launch policy.
- Detached-session state now distinguishes presentation from runtime existence.
The app consumes `targetRuntime`, typed detached control results, and typed
capture results so a missing detached target can be restored through
framework-owned recovery instead of guessed ordinary app launch.
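
The recovery flow described in the bullet above can be sketched as a single recover-then-retry wrapper. `ControlResult`, `TargetControlException`, and `runWithRecovery` are illustrative names rather than the app's API; the status and runtime labels are the ones this change introduces.

```kotlin
// Simplified, hypothetical result type carrying only the labels needed here.
data class ControlResult(val statusLabel: String, val runtimeLabel: String) {
    fun isOk(): Boolean = statusLabel == "STATUS_OK"

    // Recovery is warranted when the display or task is gone, or the runtime is missing.
    fun needsRecovery(): Boolean =
        statusLabel == "STATUS_NO_DETACHED_DISPLAY" ||
            statusLabel == "STATUS_NO_TARGET_TASK" ||
            runtimeLabel == "TARGET_RUNTIME_MISSING"
}

class TargetControlException(message: String) : Exception(message)

// Run a control request; if it reports a missing target, ask the framework to
// restore the hidden target once and retry the request a single time.
fun runWithRecovery(
    request: () -> ControlResult,
    ensureHidden: () -> ControlResult,
): ControlResult {
    var result = request()
    if (result.needsRecovery()) {
        val recovery = ensureHidden()
        if (!recovery.isOk()) {
            throw TargetControlException("Recovery failed: $recovery")
        }
        result = request()
    }
    if (!result.isOk()) {
        throw TargetControlException("Request failed: $result")
    }
    return result
}
```

Retrying at most once keeps the failure mode bounded: a target that is still missing after framework recovery surfaces as an error instead of a loop.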
The Android app now owns auth origination, runtime status, and per-session
transport configuration handoff. Active Genie model traffic is framework-owned.
@@ -177,8 +181,8 @@ the Android Agent/Genie flow.
- Per-session framework transport provisioning in
`android/bridge/src/main/java/com/openai/codex/bridge/FrameworkSessionTransportCompat.kt`
- Framework-only Android dynamic tools registered on the Genie Codex thread with:
- detached target show/hide/attach/close
- detached frame capture
- detached target ensure-hidden/show/hide/attach/close
- typed detached frame capture with runtime-aware recovery
- `request_user_input` bridged from hosted Codex back into AgentSDK questions
- Agent-owned question notifications for Genie questions that need user input
- Agent-mediated free-form answers for Genie questions, using the hosted Agent
@@ -236,12 +240,6 @@ Set the Agent Platform stub SDK zip path:
export ANDROID_AGENT_PLATFORM_STUB_SDK_ZIP=/path/to/android-agent-platform-stub-sdk.zip
```
Build both Android binaries first:
```bash
just android-build
```
Build both Android apps:
```bash
@@ -249,8 +247,9 @@ cd android
./gradlew :genie:assembleDebug :app:assembleDebug
```
The Agent app and Genie app both depend on `just android-build` for the
packaged `codex` JNI binaries.
The Android Gradle build depends on `just android-build` for the packaged
`codex` JNI binaries and will run it automatically when the native artifact is
out of date.
## Next Implementation Steps